Category: data center
Do Enterprises Need VRFs?
One of my readers sent me a long list of questions titled “Do enterprise customers REALLY need VRFs?”
The only answer I could give is “it depends” (it’s like asking “Do animals need wings?”), and here’s my attempt at building a decision tree:
You can use the decision tree to figure out whether you need VRFs in your data center or in your enterprise WAN.
Save the date: Leaf-and-Spine Fabric Design Workshop in Zurich
Do you believe in a vendor-supplied black box (regardless of whether you call it ACI or SDDC) or in building your own data center fabric using solid design principles?
It should be an easy choice if you believe a business should control its own destiny instead of being pulled around by vendor marketing (to paraphrase Russ White).
Do I Need Redundant Firewalls?
One of my readers sent me this question:
I often see designs involving several (more than 2) DCs spread over different locations. I was actually wondering if it makes sense to bring high availability inside the DC while there's redundancy in place between the DCs. For example, is there a good reason to put a cluster of firewalls in a DC, when it is possible to quickly fail over to another available DC, as a redundant cluster increases costs, licenses and complexity.
Rule#1 of good engineering: Know Your Problem ;) In this particular case:
Optimize Your Data Center: Virtual Appliances
We got pretty far in our Data Center optimization journey. We virtualized the workload, got rid of legacy technologies, reduced the number of server uplinks, and replaced storage arrays with a distributed file system.
Final step on the journey: replace physical firewalls and load balancers with virtual appliances.
Ansible versus Puppet in Initial Device Provisioning
One of the attendees of my Building Next-Generation Data Center course asked this interesting question after listening to my description of the differences between Chef/Puppet and Ansible:
For Zero-Touch Provisioning to work, an agent gets installed on the box as a boot up process that would contact the master indicating the box is up and install necessary configuration. How does this work with agent-less approach such as Ansible?
Here’s the first glitch: many network devices don’t ship with a Puppet or Chef agent; you have to install it during the provisioning process.
Use VRFs to Solve Routing-on-Hosts Challenges
One of my readers sent me interesting feedback after reading my explanation of why I’d try not to use OSPF as a routing protocol between hosts and ToR switches. He said:
Unfortunately we can’t use BGP because IBM mainframes support only OSPF or RIP, so we decided to use VRFs instead.
Here’s what they did:
Replacing FabricPath with VXLAN, EVPN or ACI?
One of my friends plans to replace an existing FabricPath data center infrastructure, and asked whether it would make sense to stay with FabricPath (using the new Nexus 5600 switches) or migrate to ACI.
I proposed a third option: go with simple VXLAN encapsulation on Nexus 9000 switches. Here’s why:
How Many vMotion Events Can You Expect in a Data Center?
One of my friends sent me this question:
How many VM moves do you see in a medium and how many in a large data center environment per second and per minute? What would be a reasonable maximum?
Obviously the answer to the first part is it depends (please share your experience in the comments), so we’ll focus on the second one. It’s time for another Fermi estimate.
Why Would I Use BGP and not OSPF between Servers and the Network?
While we were preparing for the Cumulus Networks’ Routing on Hosts webinar, Dinesh Dutt sent me a message along these lines:
You categorically reject the use of OSPF, but we have a couple of customers using it quite happily. I’m sure you have good reasons, and the reasons you list [in the presentation] are ones I agree with. OTOH, why not use totally stubby areas with hosts in such an area?
How about:
The Cost of Networking Has Not Declined
One of the common taglines parroted by SDN aficionados goes along the lines of “The cost to acquire and manage server and storage architectures has declined over time while networking stays stubbornly expensive.” (I took it straight from an anonymous blog comment).
Let’s see how well it matches reality.
Whitebox Switching at LinkedIn with Russ White on Software Gone Wild
When LinkedIn announced their Project Falco I knew exactly what one of my future Software Gone Wild podcasts would be: a chat with Russ White (Mr. CCDE, now network architect @ LinkedIn).
It took us a long while (and then the summer break intervened) but I finally got it published: Episode 62 is waiting for you.
Running BGP between Virtual Machine and ToR Switch
One of my readers left this question on the blog post resurfacing the idea of running BGP between servers and ToR switches:
When using BGP on a VM for mobility, what is the best way to establish a peer relationship with a new TOR switch after a live migration? The VM won't inherently know the peer address or the ASN.
As always, the correct answer is it depends.
Using BGP in Leaf-and-Spine Fabrics
In the Leaf-and-Spine Fabric Designs webinar series we started with the simplest possible design: non-redundant server connectivity with bridging within a ToR switch and routing across the fabric.
After I explained the basics (including routing protocol selection, route summarization, link aggregation and addressing guidelines), Dinesh Dutt described how network architects use BGP when building leaf-and-spine fabrics.
Why Is Stretched ACI Infinitely Better than OTV?
Eluehike Chedu asked an interesting question after my explanation of why stretched ACI fabric (or alternatives, see below) is the least horrible way of stretching a subnet: What about OTV?
Time to go back to the basics. As Dinesh Dutt explained in our Routing on Hosts webinar, there are (at least) three reasons why people want to see stretched subnets:
Planning for Migration into the Cloud?
One of my readers sent me this question:
Have you written something about assessment and planning for migration of traditional in-premise data center network to private or public cloud? There would be hundreds of things to check during assessment and then plan accordingly.
Academically, that’s a wrong way of approaching the problem.