Category: switching
Forwarding State Abstraction with Tunneling and Labeling
Yesterday I described how the limited flow setup rates offered by most commercially-available switches force the developers of production-grade OpenFlow controllers to drop the microflow ideas and focus on state abstraction (people living in a dreamland usually go in a totally opposite direction). Before going into OpenFlow-specific details, let’s review the existing forwarding state abstraction technologies.
FIB Update Challenges in OpenFlow Networks
Last week I described the problems high-end service provider routers (or layer-3 switches if you prefer that terminology) face when they have to update large number of entries in the forwarding tables (FIBs). Will these problems go away when we introduce OpenFlow into our networks? Absolutely not, OpenFlow is just another mechanism to download forwarding entries (this time from an external controller) not a laws-of-physics-changing miracle.
Which virtual networking technology should I use?
After I published the Decouple virtual networking from the physical world article, @paulgear1 sent me a very valid tweet: “You seemed a little short on suggestions about the path forward. What should customers do right now?” Apart from the obvious “it depends”, these are the typical use cases (as I understand them today – please feel free to correct me).
Decouple virtual networking from the physical world
Isn’t it amazing that we can build the Internet, run the same web-based application on thousands of servers, give millions of people access to cloud services … and stumble badly every time we’re designing virtual networks. No surprise, by trying to keep vSwitches simple (and their R&D and support costs low), the virtualization vendors violate one of the basic scalability principles: complexity belongs to the network edge.
… updated on Monday, May 20, 2024 17:58 +0200
VM-aware Networking Improves IaaS Cloud Scalability
In the VMware vSwitch – the baseline of simplicity post I described simple layer-2 switches offered by most hypervisor vendors and the scalability challenges you face when trying to build large-scale solutions with them. You can solve at least one of the scalability issues: VM-aware networking solutions available from most data center networking vendors dynamically adjust the list of VLANs on server-to-switch links.
VMware vSwitch – the baseline of simplicity
If you’re looking for a simple virtual switch, look no further than VMware’s venerable vSwitch. It runs very few control protocols (just CDP or LLDP, no STP or LACP), has no dynamic MAC learning, and only a few knobs and moving parts – ideal for simple deployments. Of course you have to pay for all that ease-of-use: designing a scalable vSwitch-based solution is tough (but then it all depends on what kind of environment you’re building).
Nexus vPC and Consistency Checker
Michel sent me a detailed e-mail describing both his enthusiasm with vPC and the headaches consistency checker is causing him. Here’s the good part:
Nexus vPC seems like a perfect solution for real multi-chassis etherchannel. At work we're using it extensively on a few pairs of Nexus 7000's.
... and then it turns sour:
However, there is one MAJOR drawback with vPC at this time, it's the way the consistency checker works (or rather, does not work). We've come across two specific situations where consistency checker will bring down your beautiful and redundant vPC link, and we've found no way around.
Here are his problems:
Big Switch Networks might actually make sense
Big Switch Networks is one of those semi-stealthy startups that like to hint at what they’re doing without actually telling you anything, so I was very keen to meet Kyle Forster and Guido Appenzeller during the OpenFlow Symposium and asked them a simple question: “can you explain in 3 minutes what it is you’re doing?”
L2 or L3 switching in campus networks?
Michael sent me an interesting question:
I work in a rather large enterprise facing a campus network redesign. I am in favor of using a routed access for floor LANs, and make Ethernet segments rather small (L3 switching on access devices). My colleagues seem to like L2 switching to VSS (distribution layer for the floor LANs). OSPF is in use currently in the backbone as the sole routing protocol. So basically I need some additional pros and cons for VSS vs Routed Access. :-)
The follow-up questions confirmed he has L3-capable switches in the access layer connected with redundant links to a pair of Cat6500s:
MPLS is not tunneling
Greg (@etherealmind) Ferro started an interesting discussion on Google+, claiming MPLS is just tunneling and a duct tape like NAT. I would be the first one to admit MPLS has its complexities (not many ;) and shortcomings (a few ;), but calling it a tunnel just confuses the innocents. MPLS is not tunneling, it’s a virtual-circuits-based technology, and the difference between the two is a major one.
VXLAN termination on physical devices
Every time I’m discussing the VXLAN technology with a fellow networking engineer, I inevitably get the question “how will I connect this to the outside world?” Let’s assume you want to build pretty typical 3-tier application architecture (next diagram) using VXLAN-based virtual subnets and you already have firewalls and load balancers – can you use them?
The product information in this blog post is outdated - Arista, Brocade, Cisco, Dell, F5, HP and Juniper are all shipping hardware VXLAN gateways (this post has more up-to-date information). The concepts explained in the following text are still valid; however, I would encourage you to read other VXLAN-related posts on this web site or watch the VXLAN webinar to get a more recent picture.
NVGRE – because one standard just wouldn’t be enough
Two weeks after VXLAN (backed by VMware, Cisco, Citrix and Red Hat) was launched at VMworld, Microsoft, Intel, HP & Dell published NVGRE draft (Arista and Broadcom are cleverly sitting on both chairs) which solves the same problem in a slightly different way.
If you’re still wondering why we need VXLAN and NVGRE, read my VXLAN post (and the one describing how VXLAN, OTV and LISP fit together).
It’s obvious the NVGRE draft was a rushed affair, its only significant and original contribution to knowledge is the idea of using the lower 24 bits of the GRE key field to indicate the Tenant Network Identifier (but then, lesser ideas have been patented time and again). Like with VXLAN, most of the real problems are handwaved to other or future drafts.
You Don’t Need OpenFlow to Solve Every Age-Old Problem
I read two great blog posts on Sunday: evergreen Fallacies of Distributed Computing from Bob Plankers and forward-looking Understanding Hadoop Clusters and the Network from Brad Hedlund. Read them both before continuing (they are both great reads) and try to figure out why I’m mentioning them in the same sentence (no, it’s not the fact that Hadoop uses distributed computing).
Long-distance IRF Fabric: Works Best in PowerPoint
HP has commissioned an IRF network test that came to absolutely astonishing conclusions: vMotion runs almost twice as fast across two links bundled in a port channel than across a single link (with the other one being blocked by STP). The test report contains one other gem, this one a result of incredible creativity of HP marketing:
For disaster recovery, switches within an IRF domain can be deployed across multiple data centers. According to HP, a single IRF domain can link switches up to 70 kilometers (43.5 miles) apart.
You know my opinions about stretched cluster… and the more down-to-earth part of HP Networking (the people writing the documentation) agrees with me.
Nexus 1000V LACP offload and the dangers of in-band control
A while ago someone sent me the following comment as part of a lengthy discussion focusing on Nexus 1000V: “My SE tells me that the latest 1000V release has rewritten the LACP code so that it operates entirely within the VEM. VSM will be out of the picture for LACP negotiations. I guess there have been problems.”
If you’re not convinced you should be running LACP between the ESX hosts and the physical switches, read this one (and this one). Ready? Let’s go.