Blog Posts in October 2013
If you need more details or an in-depth evaluation of products from numerous vendors, check out the Overlay Virtual Networking webinar (the final videos have just been published).
Last week I described how Cisco Modeling Lab (CML, the product formerly known as VIRL) works behind its fantastic UI, and promised more information about the UI once I get access to a preview version of CML, which I got a few days ago. Here are the results of the first brief stroll down the virtual lane.
A group of researchers presented an “interesting” result @ IETF 87: migrating from IBGP full mesh to IBGP route reflectors can introduce temporary forwarding loops. OMG, really?
Don’t panic, the world is not about to become a Vogon hyperspace bypass. Let’s put their results in perspective.
Every good data center presentation starts with redefining The Problem and my VMware NSX Architecture webinar was no exception – the first section describes Infrastructure-as-a-Service Networking Requirements.
I sprinted through this section during the live session; the video with a longer (and more detailed) explanation comes from the Overlay Virtual Networking webinar.
The first hints of VIRL started appearing around Cisco Live US 2013, where the product development team demonstrated Cisco’s take on a 21st century network modeling tool. A few days ago, Omar Sultan, Joel Obstfeld and Ed Kern gave us a brief peek behind the scenes of this totally awesome tool (note to Cisco haters: I haven’t been drinking the teal Kool-Aid for a long time – this is my honest impression).
Ashton Bothman from Juniper invited me to an interesting contest: build a Lego Data Center. I just happen to have in-house Lego Design Experts (read: kids), so I gladly delegated the task to that team. Here are the results (using the Force instead of unicorns).
An odd idea struck me when watching the Avalanche NEXT presentation during Networking Tech Field Day – they have a fuzzing module that you can use to test whether your servers and applications survive all sorts of crazy illegal requests. Could that be used to detect SQL injection vulnerabilities in your web apps?
The number of flows in hardware switches (dictated by the underlying TCAM size) is one of the major roadblocks in a large-scale OpenFlow deployment. Vendors are supposedly making progress, with Intel claiming up to 4000 12-tuple flow entries in their new Ethernet Switch FM6700 series. Is that good enough? As always, it depends.
TL&DR summary: Use switches that support OpenFlow 1.3.
Ronald Bartels created an interesting network troubleshooting checklist that covers numerous aspects of the troubleshooting process, from information gathered during the problem reporting phase to timelines, investigation activities, device and port checks ... Feedback highly welcome!
Another day, another stateful debate, this time centered on the number of flows per hypervisor. Previously I guesstimated 2,500 connections-per-second-per-(user-facing)gigabit and 37,500 concurrent sessions per user-facing gigabit, but wanted to align my numbers with reality before reaching any conclusions.
My web sites are way too small, so I asked a few of my friends to help me get more realistic figures.
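To get a feel for what those guesstimates mean per hypervisor, here is a minimal sketch that scales them linearly with user-facing bandwidth. The two per-gigabit figures come from the text above; the function name, the linear scaling and the 10 Gbps example are assumptions made for illustration, not measured values.

```python
# Guesstimates quoted in the text, per gigabit of user-facing bandwidth.
CPS_PER_GBPS = 2_500
SESSIONS_PER_GBPS = 37_500


def flow_estimate(user_facing_gbps: float) -> dict:
    """Scale the per-gigabit guesstimates to a given user-facing bandwidth.

    Assumes the figures scale linearly with bandwidth (an illustration only).
    """
    return {
        "connections_per_second": CPS_PER_GBPS * user_facing_gbps,
        "concurrent_sessions": SESSIONS_PER_GBPS * user_facing_gbps,
        # Little's law: concurrent sessions = arrival rate * average lifetime,
        # so the two guesstimates together imply an average session lifetime.
        "implied_session_lifetime_s": SESSIONS_PER_GBPS / CPS_PER_GBPS,
    }


if __name__ == "__main__":
    # Hypothetical hypervisor carrying 10 Gbps of user-facing traffic.
    print(flow_estimate(10))
```

Under these assumptions a 10 Gbps hypervisor would see 25,000 new connections per second and 375,000 concurrent sessions, and the two guesstimates together imply an average session lifetime of 15 seconds.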
Jason Edelman wrote a great blog post after watching Ethan Banks struggle with yet another multi-vendor IPsec deployment. Some of his ideas make perfect sense (wiki-like web site documenting working configurations between vendor X and Y for every possible X and Y), others less so (tunnel broker – particularly in view of recent Tor challenges), but let’s step back a bit and ask ourselves “Why is IPsec so complex?”
OpenFlow is a simple TCAM programming protocol, and can be used to implement any network forwarding paradigm as long as:
- OpenFlow specifications include matches and actions (including rewrites) of the packet header fields used in the forwarding paradigm. For example, you cannot program SRv6 tunnels with OpenFlow because SRv6 is not part of the OpenFlow standard.
- The forwarding hardware you want to use supports the OpenFlow matches and actions you need in your forwarding paradigm.
- The forwarding paradigm does not use dynamic interfaces (example: MPLS-TE tunnels) or multipoint tunnel interfaces (example: VXLAN). OpenFlow was designed to be used on point-to-point physical interfaces and does not include interface management.
This blog post describes some of the more common OpenFlow use cases (assuming you want to use an obsolete, rarely implemented protocol).
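The three conditions above can be read as a simple checklist. The sketch below is one hypothetical way to encode it: the `Paradigm` class, the `openflow_blockers` function and the feature labels (such as `match:ipv4_dst`) are invented for illustration and are not OpenFlow identifiers or part of any controller API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Paradigm:
    """A forwarding paradigm described by what it needs (hypothetical model)."""
    name: str
    needed: frozenset                 # matches/actions the paradigm requires
    dynamic_interfaces: bool = False  # e.g. MPLS-TE tunnels
    multipoint_tunnels: bool = False  # e.g. VXLAN


def openflow_blockers(paradigm, spec_features, hw_features):
    """Return the reasons the paradigm cannot be implemented (empty list = OK)."""
    blockers = []
    # Condition 1: the specification must define every needed match/action.
    missing_in_spec = paradigm.needed - spec_features
    if missing_in_spec:
        blockers.append("not in the OpenFlow specification: "
                        + ", ".join(sorted(missing_in_spec)))
    # Condition 2: the hardware must support what the specification defines.
    missing_in_hw = (paradigm.needed & spec_features) - hw_features
    if missing_in_hw:
        blockers.append("not supported by the hardware: "
                        + ", ".join(sorted(missing_in_hw)))
    # Condition 3: no dynamic or multipoint tunnel interfaces.
    if paradigm.dynamic_interfaces:
        blockers.append("uses dynamic interfaces")
    if paradigm.multipoint_tunnels:
        blockers.append("uses multipoint tunnel interfaces")
    return blockers


if __name__ == "__main__":
    # Feature labels are made up for this example.
    spec = frozenset({"match:ipv4_dst", "action:output", "action:set_vlan"})
    hw = frozenset({"match:ipv4_dst", "action:output"})
    routing = Paradigm("destination-based forwarding",
                       frozenset({"match:ipv4_dst", "action:output"}))
    print(routing.name, "->", openflow_blockers(routing, spec, hw) or "OK")
```

Returning a list of blockers instead of a single boolean keeps the reason visible, which matters because each of the three conditions fails for a different party: the standard, the switch vendor, or the design itself.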
A while ago I wrote about ATAoE and why I think a layer-2-only TFTP-like protocol shouldn’t be used these days. As always, the answer to that black-and-white opinion (and I’m full of them) is “it depends” – ATAoE works great if you do it right.
It all started with a message from one of my Twitter friends: “how on Earth do you find the time to blog so often?” Here’s the secret recipe: a happy little thought and a bit of fairy dust. No, got it wrong, that helps you fly. The real secret ingredients: time, process, ideas, and a pinch of motivation.
One of the holy grails of data center SDN evangelists is controller-driven traffic engineering (throwing more leaf-and-spine bandwidth at the problem might be cheaper, but definitely not sexier). Obviously they don’t call it traffic engineering as they don’t want to scare their audience with MPLS TE nightmares, but the idea is the same.
Interestingly, you don’t need new technologies to get as close to that holy grail as you wish; Petr Lapukhov got there with a 20-year-old technology – BGP.
TL&DR Summary: Yes (if you’re clumsy enough).
A while ago I read Impact of Graceful IGP Operations on BGP – an article that described how changes in IGP topology result in temporary (or sometimes even permanent) forwarding loops in networks using BGP route reflectors.
Is the problem real? Yes, it is. Could you generate a BGP RR topology that results in a permanent forwarding loop? Yes. It’s not that hard.
Tassos opened an interesting can of worms in a comment to my Management, Control and Data Planes post: Is ICMP response to a forwarded packet (TTL exceeded, fragmentation needed or destination unreachable) a control- or data-plane activity?
My keynote speech @ PLNOG11 conference was focused on (surprise, surprise) overlay virtual networks and described the usual motley crew: The Annoying Problem, The Hated VLAN, The Overlay Unicorn, The Control-Plane Wisdom and The Ever-Skeptic Use Case. You can view the presentation on my web site; PLNOG organizers promised a video recording in mid-October.
After we get rid of the QoS FUD, the next question I usually get when discussing overlay networks is “how should these networks treat IP TTL?”
As (almost) always, the answer is “It depends.”
OpenStack seems to have a great architecture: all device-specific code is abstracted into plugins that have a well-defined API, allowing numerous (more or less innovative) implementations under the same umbrella orchestration system.
Looks great in PowerPoint, but to an uninitiated outsider looking at the network (Quantum, now Neutron) plugin through the lens of OpenStack Neutron documentation, it looks like it was designed by either a vendor or a server-focused engineer using NIC device driver concepts.