Blog Posts in February 2013
Finally found time to list all the events I’m scheduled to present at in the next few months. Hope you understand why I might occasionally miss one of the daily blog posts … but if you’re attending one of those events, please do send me an email or look for me there and we’ll have a nice cup of conversation (as one of my old friends used to say).
Smaller Clos fabrics are built with two layers of switches: leaf and spine switches. The oversubscription ratio you want to achieve dictates the number of uplinks on the leaf switch, which in turn dictates the maximum number of spine switches and thus the fabric size.
You have to use a multi-stage Clos architecture if you want to build bigger fabrics; Brad Hedlund described a sample fabric with over 24,000 server-facing ports in the Clos Fabrics Explained webinar.
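A quick back-of-the-envelope sketch shows how the two numbers bound a two-layer fabric (made-up but typical port counts, not numbers from the webinar):

```python
# Back-of-the-envelope two-layer Clos sizing -- illustrative numbers only
def fabric_size(server_ports, oversubscription, spine_ports):
    """server_ports: server-facing ports per leaf (all ports the same speed)
    oversubscription: desired ratio, e.g. 3 for 3:1
    spine_ports: ports per spine switch"""
    uplinks = server_ports / oversubscription   # uplinks needed on every leaf
    max_spines = int(uplinks)                   # one uplink per spine switch
    max_leaves = spine_ports                    # one spine port per leaf
    return max_leaves * server_ports, max_spines

# 48 server ports per leaf, 3:1 oversubscription, 64-port spine switches
print(fabric_size(48, 3, 64))   # (3072, 16) -> 3072 server ports, 16 spines
```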
Another day, another interesting Expert Express engagement, another stretched layer-2 design solving the usual requirement: “We need inter-DC VM mobility.”
This time the usual question (“And why would you want to vMotion a VM between data centers?”) got a refreshing answer: “Oh, no, that would not work for us.”
Virtual tenant networks are one of the best features of the NEC ProgrammableFlow solution – you can build virtual layer-2 subnets (based on VLANs, edge ports, or port/VLAN combos), connect them with a virtual router, and implement packet filters and traffic steering ... while treating the whole data center fabric as a single device.
Even better, the ingress edge switch performs all the operations you configure (ACLs, L2 lookup, L3 lookup, source/destination MAC rewrite), resulting in optimal end-to-end forwarding.
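Here’s a toy model of that single-pass ingress pipeline – my simplification for illustration, not NEC’s actual flow-table layout:

```python
# Toy model of ingress-edge processing in a fabric that behaves like one big
# device; names and data structures are illustrative, not NEC's internals.

def ingress_edge_pipeline(pkt, acl, l2_table, l3_table, router_mac):
    """Return (egress_switch, egress_port, pkt) or None if the packet is dropped."""
    if not acl(pkt):                        # packet filter applied at the edge
        return None
    if pkt["dst_mac"] != router_mac:        # intra-subnet: plain L2 lookup
        return (*l2_table[pkt["dst_mac"]], pkt)
    # inter-subnet: L3 lookup and MAC rewrite happen at the ingress switch,
    # so the packet crosses the fabric exactly once
    next_hop_mac, egress = l3_table[pkt["dst_ip"]]
    pkt["src_mac"], pkt["dst_mac"] = router_mac, next_hop_mac
    return (*egress, pkt)

# Intra-subnet example: ACL permits everything, destination MAC sits on switch S3, port 7
print(ingress_edge_pipeline(
    {"src_mac": "bb:bb", "dst_mac": "aa:aa", "dst_ip": "10.0.2.9"},
    acl=lambda p: True,
    l2_table={"aa:aa": ("S3", 7)},
    l3_table={},
    router_mac="ee:ee",
))
```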
You’ve probably heard that the networking hardware vendors decided to pool resources to create an open-source OpenFlow controller. Just in case you’re wondering whether they lost their mind (no, they didn’t), here’s my cynical take.
A while ago I got an interesting question:
Let's say that due to circumstances outside of your control, you must have stretched data center subnets... What is the best method to get these subnets into OSPF? Should they share a common area at each data center or should each data center utilize a separate area for the same subnet?
Assuming someone hasn’t sprinkled the application willy-nilly across the two data centers, it’s best if the data center edge routers advertise the subnets used by the applications as type-2 external routes, ensuring one data center is always the primary entry point for a specific subnet. Getting the same results with BGP routing in the Internet is a much tougher challenge.
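The reason type-2 externals do the trick: their metric ignores the internal cost toward the advertising router, so every router in the OSPF domain prefers the data center advertising the lower external metric. A tiny sketch with made-up metrics:

```python
# Why OSPF type-2 (E2) externals pin the ingress DC: the E2 metric ignores the
# internal cost to the advertising router, the E1 metric adds it. Made-up numbers.

external_metric = {"DC-A": 100, "DC-B": 200}   # DC-A should be the primary entry point
internal_cost   = {"DC-A": 150, "DC-B": 5}     # cost from some remote router to each DC edge

best_e2 = min(external_metric, key=lambda dc: external_metric[dc])
best_e1 = min(external_metric, key=lambda dc: external_metric[dc] + internal_cost[dc])

print(best_e2)   # DC-A -- every router agrees, regardless of internal topology
print(best_e1)   # DC-B -- routers closer to DC-B would enter through it instead
```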
I’m probably flogging a fossilized skeleton of a long-dead horse, but it seems I never wrote about this topic before, so here it is (and you might want to read this book for more details).
Process switching is the oldest, simplest, and slowest packet forwarding mechanism. Packets received on an interface trigger an interrupt; the interrupt handler identifies the layer-3 protocol based on the layer-2 packet headers (example: Ethertype in Ethernet frames) and queues the packets to the (user-mode) packet forwarding processes (IP Input and IPv6 Input in Cisco IOS).
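Here’s a heavily simplified sketch of that sequence (hypothetical names and structures, not actual IOS internals):

```python
# Heavily simplified model of process switching; names are illustrative only.
from collections import deque

ETHERTYPES = {0x0800: "IP Input", 0x86DD: "IPv6 Input"}
process_queues = {"IP Input": deque(), "IPv6 Input": deque()}

def rx_interrupt_handler(frame):
    """Classify the layer-3 protocol from the layer-2 header and queue the
    packet for the forwarding process; the actual lookup happens much later,
    when the process gets scheduled."""
    process = ETHERTYPES.get(frame["ethertype"])
    if process is None:
        return "drop"                        # no forwarding process for this protocol
    process_queues[process].append(frame)
    return "queued for " + process

print(rx_interrupt_handler({"ethertype": 0x0800, "payload": b"..."}))   # queued for IP Input
```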
The ProgrammableFlow Principles of Operations section of the ProgrammableFlow Technical Deep Dive webinar generated tens of questions from the audience – it took us almost 20 minutes to answer all of them (note: you might want to watch the answers after watching the section that triggered the questions).
The second part of ProgrammableFlow Technical Deep Dive webinar describes ProgrammableFlow principles – physical networks (control, data, HA cluster, management), edge and core switches, topology discovery and end-to-end routing and path setup procedure.
Matt Thompson provided a really good answer to the “what’s an acceptable oversubscription ratio in a ToR switch” question when he wrote “I’m expecting a ‘how long is a piece of string’ answer” (note: do watch the BBC video answering that one).
There’s the 3:1 rule-of-thumb recipe, with a more realistic answer being “it depends”. Now let’s see if we can go beyond that without a deep dive into scholastic waters.
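For reference, here’s what 3:1 actually means on a garden-variety 10GE ToR switch (my arithmetic, generic port counts):

```python
# Oversubscription ratio of a typical ToR switch -- generic port counts
def oversubscription(down_ports, down_speed, up_ports, up_speed):
    return (down_ports * down_speed) / (up_ports * up_speed)

# 48 x 10GE server-facing ports, 4 x 40GE uplinks
print(oversubscription(48, 10, 4, 40))   # 3.0 -> the 3:1 rule of thumb
```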
Matthew Stone encountered another unintended consequence of full Internet routing in a VRF design: the TCAM on his 6500 was 80% utilized even though he had the new Sup modules supporting one million IPv4 routes.
A closer look revealed the first clue: L3 forwarding resources on a Cat6500 are shared between IPv4 routes and MPLS labels (don’t know about you, but I was not aware of that), and half the entries were consumed by MPLS labels.
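Back-of-the-envelope numbers consistent with those symptoms (hypothetical figures chosen to match the 80% utilization, not Matthew’s actual printouts):

```python
# A shared L3 forwarding table fills up twice as fast when the MPLS label
# entries sit next to the IPv4 routes -- hypothetical figures for illustration.

table_size  = 1_000_000    # IPv4 forwarding entries on the new Sup modules
ipv4_routes =   400_000    # full(ish) Internet table carried in the VRF
mpls_labels =   400_000    # label entries -- roughly as many as the routes

utilization = (ipv4_routes + mpls_labels) / table_size
print(f"{utilization:.0%}")   # 80% -- even though the route count alone looks harmless
```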
Mark Berly spent plenty of time explaining the in-depth intricacies of VXLAN-to-VLAN gateways during our VXLAN Technical Deep Dive webinar. He’s obviously heavily immersed in this challenge and hits 9+ on the Nerd Meter, so you might have to watch the video a few times to get all the nuances. What can I say – we’ll have fun times in the coming years ;)
... in a VXLAN segment (over UDP-over-IP-over-EoMPLS-over-MPLS-over-GRE-over-IPsec-over-IP-over-MPLS/VPN-over-MPLS-over-PPPoE-over-ATM-over-ATOM-over ...)
Source: DevOps reactions
I don’t want to publish the “best of the year” statistics before a month or two has elapsed, as the blog posts tend to accumulate views long after they’ve been published. Assuming you read all the content from 2012 you were interested in, here are the results:
The clear winner is (not surprisingly) Does CCIE still make sense?, and here’s the rest of the top-10 list:
Here’s the first video from the ProgrammableFlow webinar. Don’t get too excited, it’s yet another introduction to OpenFlow and control/data plane separation (but it is an animated one ;). More goodies to follow ...
I had an interesting conversation with Doug Hanks (@douglashanksjr) about the need for intra-spine links in leaf-and-spine fabric designs. You clearly don’t need links between spine switches when every leaf node (switch or router/firewall/load balancer) is connected to all spine switches ... but what happens when one of the leaf-to-spine links fails? Will other leaf switches know that they have to avoid the spine switch with the failed link?
Cisco launched two new data center switches on Monday: Nexus 6001, a 1RU ToR switch with the exact same port configuration as any other ToR switch on the market (48 x 10GE, 4 x 40GE usable as 16 x 10GE) and Nexus 6004, a monster spine switch with 96 40GE ports (it has the same bandwidth as Arista’s 7508 in a 4RU form factor and three times as many 40GE ports as Dell Force10 Z9000).
Apart from slightly higher port density, Nexus 6001 looks almost identical to Nexus 5548 (which has 48 10GE ports) or Nexus 3064X. So where’s the beef?
Of course he’s right, and while you can never escape some lock-in, as Bob Plankers explains (part 1, response from Greg Ferro, part 2 – all definitely worth reading), you do have to ask yourself: “am I looking for Windows or Mac?”
Martin Bernier has decided to open another can of IPv6 worms: how do you address multiple subnets in a very typical setup where you use a firewall (example: ASA) to connect an SMB network to the outside world?
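Whatever addressing plan you settle on, carving the individual subnets out of a delegated prefix is the easy part – for example, assuming the ISP delegates a /56 (my assumption, not part of Martin’s question):

```python
# Carving /64 subnets out of a delegated /56 -- prefix and delegation size are
# assumptions made up for this example.
import ipaddress

delegated = ipaddress.ip_network("2001:db8:1234:ab00::/56")
subnets = list(delegated.subnets(new_prefix=64))

print(len(subnets))    # 256 /64 subnets to hand out to internal segments
print(subnets[0])      # 2001:db8:1234:ab00::/64
print(subnets[1])      # 2001:db8:1234:ab01::/64
```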
During the IPv6 Security webinar Eric Vyncke explained the intricate details of IPv6 Secure Neighbor Discovery (SEND) and the reasons it will probably never take off.
If you happen to have masochistic tendencies and too much time, please do use SEND; it’s been available on Cisco IOS for a while, and there are “plenty” of host implementations to choose from.