Category: data center
Data Center BGP: Autonomous Systems and AS Numbers
Two weeks ago we discussed whether it makes sense to use BGP as the routing protocol in a data center fabric. Today we’ll tackle three additional design challenges:
- Should you use IBGP or EBGP?
- When should you run BGP on the spine switches?
- Should every leaf switch have a different AS number or should they share the same AS number?
Video: Switch Buffer Architectures
A while ago (in the time of big-versus-small buffers brouhaha), I asked JR Rivers to do a short presentation focusing on buffering requirements of data center switches. He started by describing typical buffer architectures you might find in data center switches.
BGP as a Better IGP? When and Where?
A while ago I helped a large enterprise redesign their data center fabric. They did a wonderful job optimizing their infrastructure, so all they really needed were two switches in each location.
Some vendors couldn’t fathom that. One of them proposed to build a “future-proof” (and twice as expensive) leaf-and-spine fabric with two leaves and two spines. On top of that they proposed to use EBGP as the only routing protocol because draft-lapukhov-bgp-routing-large-dc – a clear case of missing the customer needs.
Security or Convenience, That’s the Question
One of my readers was so delighted that something finally happened after I wrote about a NX-OS bug that he sent me a pointer to another one that has been pending for a long while, and is now officially terminated as FAD (Functions-as-Designed… even documented in the Further Problem Description).
Here’s what he wrote (slightly reworded)…
Let’s Pretend We Run Distributed Storage over a Thick Yellow Cable
One of my friends wanted to design a nice-and-easy layer-3 leaf-and-spine fabric for a new data center, and got blindsided by a hyperconverged vendor. Here’s what he wrote:
We wanted to have a spine/leaf L3 topology for an NSX deployment but can’t do that because the Nutanix servers require L2 between their nodes so they can be in the same cluster.
I wanted to check his claims, but Nutanix doesn’t publish their documentation (I would consider that a red flag), so I’m assuming he’s right until someone proves otherwise (note: whitepaper is not a proof of anything ;).
Optimize Data Center Infrastructure: Build an Optimized Fabric
I published the last part of my Optimize Data Center Infrastructure series: build an optimized data center fabric.
To learn more about data center fabric designs, check the new online course or enroll into the Spring 2018 session of Building Next-Generation Data Center course.
Pluribus Networks… 2 Years Later
I first met Pluribus Networks 2.5 years ago during their Networking Field Day 9 presentation, which turned controversial enough that I was advised not to wear the same sweater during NFD16 to avoid jinxing another presentation (I also admit to be a bit biased in those days based on marketing deja-moo from a Pluribus sales guy I’d been exposed to during a customer engagement).
Pluribus NFD16 presentations were better; here’s what I got from them:
Another Reason to Run Linux on Your Data Center Switches
Arista’s OpenFlow implementation doesn’t support TLS encryption. Usually that’s not a big deal, as there aren’t that many customers using OpenFlow anyway, and those that do hopefully do it over a well-protected management network.
However, lack of OpenFlow TLS encryption might become an RFP showstopper… not because the customer would really need it but because the customer is in CYA mode (we don’t know what this feature is or why we’d use it, but it might be handy in a decade, so we must have it now) or because someone wants to eliminate certain vendors based on some obscure missing feature.
Video: Separate Data from Code
After explaining the challenges of data center fabric deployments, Dinesh Dutt focused on a very important topic I covered in Week#3 of the Building Network Automation Solutions online course: how do you separate data (data model describing data center fabric) from code (Ansible playbooks and device configurations)
Create a VLAN Map from Network Operational Data
It’s always great to see students enrolled in Building Network Automation Solutions online course using ideas from my sample playbooks to implement a wonderful solution that solves a real-life problem.
James McCutcheon did exactly that: he took my LLDP-to-Graph playbook and used it to graph VLANs stretching across multiple switches (and provided a good description of his solution).
Update: Cisco Nexus Switches
Third vendor in this year’s series of data center switching updates: Cisco.
As expected, Cisco launched a number of new switches in 2017, and EOL’d older models … for pretty varying value of old. For example, most of the original Nexus 9300 models are gone.
Video: Data Center Fabric Validation
Validating the expected network behavior is (according to the intent-driven pundits) a fundamental difference that makes intent-driven products more than glorified orchestration systems.
Guess what: smart people knew that for ages and validated their deployments even when using simple tools like Ansible playbooks.
Dinesh Dutt explained how he validates data center fabric deployment during the Network Automation Use Cases webinar; I’m doing something similar in my OSPF deployment playbooks (described in detail in Ansible online course).
Upgrading Virtual Appliances
In every SDDC workshop I tried to persuade the audience that the virtual appliances (particularly per-application instances of virtual appliances) are the way to go. I usually got the questions along the lines of “who will manage and audit all these instances?” but once someone asked “and how will we upgrade them?”
Short answer: you won’t.
Video: Building a Pure Layer-3 Data Center with Cumulus Linux
One of the design scenarios we covered in Leaf-and-Spine Fabric Architectures webinar is a pure layer-3 data center, and in the “how do I do this” part of that section Dinesh Dutt talked about the details you need to know to get this idea implemented on Cumulus Linux.
We covered a half-dozen design scenarios in that webinar; for an even wider picture check out the new Designing and Building Data Center Fabrics online course.
Update: Brocade Data Center Switches
Second vendor in this year’s series of data center switching updates: Brocade.
Not much has happened on this front since last year’s update. There was a maintenance release of Brocade NOS, they launched SLX series of switches, but those are so new that the software documentation didn’t have time to make it to the usual place (document library for individual switch models), it's here.
In any case, the updated videos (including edited 2016 content which describes IP Fabric in great details) are online. You can access them if you bought the webinar recording in the past or if you have an active ipSpace.net subscription.