Dear VMware, BPDU Filter != BPDU Guard

A while ago I described the need for BPDU guard in hypervisor switches, and not surprisingly got a number of “it’s there” tweets seconds after vSphere 5.1 (which includes BPDU guard) was launched. Rickard Nobel also did a magnificent job of replicating the problem my blog post is describing and verifying vSphere 5.1 stops a BPDU denial-of-service attack.

Unfortunately, BPDU filter is not the same feature as BPDU guard. Here’s why.

Imagine a happily-running simple network with two switches, two hypervisors and two VMs belonging to the same VLAN:

Now, for whatever weird reason the VM administrator decides to configure a VPN (or GRE) tunnel between the VMs and enables bridging between the Ethernet interface and the tunnel on both VMs.

Actions like this are usually caused by a Monte Carlo approach to device configuration: trying every combination of GUI-accessible features till one of them appears to be working. The Heisenbergian properties of this approach can be greatly enhanced by throwing random Google search results at the problem.

Unless the VM administrator manages to mangle all the intelligence built into the VM protocol stack (and there are always ways of doing that – see DisableSTA in Windows registry), the VMs configured as bridges start sending BPDUs through their physical interface, and any properly configured switch shuts down the offending port, preventing a forwarding loop ... and hosing the hypervisor host and all its VMs as collateral damage.

A malicious tenant could misuse BPDU guard for a BPDU-based denial-of-service attack (details in Rickard Nobel’s blog post), and VMware decided to prevent that by implementing BPDU filter (Net.BlockGuestBPDU variable) in its vSwitch. BPDU filter definitely prevents DoS attacks ... but it also destroys any chance of ever detecting a forwarding loop.
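The difference between the two features can be sketched in a few lines of Python. This is only a toy model, not VMware's implementation: the `VmPort` class and both functions are hypothetical names made up for illustration, and BPDUs are recognized solely by the standard IEEE 802.1D destination MAC address.

```python
from dataclasses import dataclass, field

# IEEE 802.1D BPDUs are sent to this reserved multicast address.
STP_MULTICAST = "01:80:c2:00:00:00"


@dataclass
class VmPort:
    """Hypothetical model of one VM-facing vSwitch port."""
    name: str
    enabled: bool = True
    log: list = field(default_factory=list)


def is_bpdu(dst_mac: str) -> bool:
    """Simplified check: classify a frame by destination MAC only."""
    return dst_mac.lower() == STP_MULTICAST


def bpdu_filter(port: VmPort, dst_mac: str) -> bool:
    """BPDU filter: return True if the frame is forwarded."""
    if not port.enabled:
        return False
    # The BPDU is dropped silently: the port stays up and nothing is logged,
    # so nobody ever learns that the VM is bridging.
    return not is_bpdu(dst_mac)


def bpdu_guard(port: VmPort, dst_mac: str) -> bool:
    """BPDU guard: return True if the frame is forwarded."""
    if not port.enabled:
        return False
    if is_bpdu(dst_mac):
        # The offending port is shut down and the event is recorded.
        port.enabled = False
        port.log.append(f"{port.name}: BPDU received, port disabled")
        return False
    return True
```

With the filter, a bridging VM keeps forwarding all its other traffic after its BPDUs are dropped; with the guard, the first BPDU takes only that VM's port down and leaves a log entry behind.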

While you can prevent bridging-induced forwarding loops with the combination of BPDU filter and reject forged transmits (described in more detail in my original blog post), you’re still avoiding the symptoms, not fixing the problem. Any VM doing unauthorized bridging should be immediately disconnected from the vSwitch – which just might prompt its administrator to correct the faulty settings.
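To see why reject forged transmits stops a bridging VM, here is a minimal sketch of the check, assuming it boils down to comparing the source MAC address of each transmitted frame with the MAC address of the virtual NIC. The function names and MAC addresses are hypothetical.

```python
def transmit_allowed(vnic_mac: str, src_mac: str) -> bool:
    """Allow a frame only if its source MAC matches the vNIC's own MAC."""
    return src_mac.lower() == vnic_mac.lower()


def forwarded_sources(vnic_mac: str, src_macs: list) -> list:
    """Return the source MACs of the frames that survive the check."""
    return [mac for mac in src_macs if transmit_allowed(vnic_mac, mac)]


# A bridging VM transmits its own frames plus frames it received through
# the tunnel, which carry the source MAC addresses of other hosts.
vnic = "00:50:56:aa:bb:01"                           # hypothetical vNIC MAC
frames = ["00:50:56:AA:BB:01", "00:50:56:aa:bb:02"]  # own frame, bridged frame
print(forwarded_sources(vnic, frames))
```

The bridged frame is dropped, so the loop never forms, but the VM stays connected and nobody is told that it tried to bridge.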

Instead of that, VMware unfortunately decided to go down the familiar “never ever disturb the VM, let’s just pretend everything is OK” route (and it’s definitely easier to implement “drop packets with SNAP value 0x010B” code than “shut down the offending interface and log the event” code).

Summary: vSwitch BPDU filter is a great step in the right direction, but we still need the solution (BPDU Guard), not a band-aid combo. Oh, and did I mention that neither BPDU filter nor reject forged transmits is enabled by default?

More information

To learn more about VMware networking, watch the recording of my VMware Networking Deep Dive webinar (also available as part of the yearly subscription).

7 comments:

  1. It's almost like.. they don't have a clue about basic L2 networking. But, hey, what do we know about networking in the 21st century... we're dinosaurs.

  2. Just looking to clarify something here....

    would have to say a rather artificial situation, GRE tunnel between 2 VMs and enable bridging, really? Would hazard one is far more likely to get run over in a car park, charged by a rhino or suffer an outage due to a bug or hardware failure.

    But still engineering resources should be allocated to this? Thereby allowing customers (or attackers) to turn on a feature which could be used to great effect against an organisation.

    Just imagine - Cloud provider x gets a demand for millions otherwise I take down your DC. A single feature could be used to great effect. Just like RRs.

    Replies
    1. Dear Anonymous!

      Hope you've vented your frustration with my blog post. Your valuable thoughts just might have more credence should you have decided to share your name and professional background with us.

      The situation is not as artificial as it might seem to you; I've seen bridging enabled between two server interfaces at least three times in real life (at least one of them involving a VPN tunnel), twice with catastrophic meltdowns, third time BPDU guard kicked in and hosed the whole ESXi server. As for car parks, fortunately I haven't seen anyone getting run over in a car park, so my small insignificant sample differs from yours.

      As for engineering resources - obviously VMware considered this a big enough problem to devote some resources to it on the programming and technical documentation side (btw, the technical marketing documents explaining this feature are great). Unfortunately, they decided to implement the suboptimal feature.

    2. Anonymous, human errors are more common than intentional attacks against Layer 2 networks in an enterprise. And many times, with networks, the mistakes will occur in creating accidental physical loops, for example accidentally configuring a server for bridging instead of teaming.

      Or having a server that is /known/ to be bridging, and then, a later change elsewhere in the physical network topology has an execution error or unforeseen consequence that results in a loop.

      If VMware wants to play around with blocking or manipulating BPDUs in any manner that a dumb switch would not, then they need to take some responsibility for how this impacts the reliability of the whole network, and provide responsible knobs -- such as "Per VM" configurability, and enforcement options of "Disable VM Virtual NIC and Send E-mail" instead of just "block packet".

      It would also be nice if they could provide more advanced security policies and extend that to forged VTP, CDP, DHCP, DHCP snooping, and ARP traffic / dynamic ARP inspection as well.


      Without the Net.BlockGuestBPDU feature, an ESXi vSwitch was not a greater risk to the network, than any other "dumb" switch that does not support STP.

      With the BlockGuestBPDU feature enabled, the ESXi vSwitch is a much greater risk of being involved in a loop than any 'dumb' switch that does not support STP, because dumb switches that don't support STP don't block BPDUs.

      This minor feature add does not change the vSwitch from being a "dumb" switch to a managed switch, and filtering STPs on a dumb unmanageable switch is absolutely something you don't want.

      Switches that support STP implement the loop avoidance protocol.

      Blocking BPDUs from being forwarded as the new filtering procedure does, breaks 802.1D bridging compliance of any software bridge in a VM.

      Meaning a loop would be catastrophic, instead of properly handled by the rest of the switching infrastructure with STP support.

      If a VM software bridge is intentional, it stands to reason that 'Reject forged transmits' will be off by design, but a loop can still occur.



  3. Background is a couple certifications in disparate technologies (no point in having 6 CCIEs in overlapping technologies in my book), a few years as an embedded coder and around 20 years industry experience across many industry sectors and technologies.

    I read, some years back, the various books you were involved with and was impressed by their content.

    Recently decided to start looking round your blog site as I had a little time on my hands, anticipating a similar type of quality and depth of thought, which I am sure you pride yourself in. What I see is some good stuff mixed in with what can only be described as a lot of noise, scare mongering and sensationalizing.

    Your blog is not doing your intellect and experience justice, in my humble opinion.

    Replies
    1. Fair enough, thank you. Let's just say that there's a bit of history behind quite a few blog posts (contact me offline if you'd like to get more information). Also, I try not to make noise based on random academic fantasies - I wrote the "We need BPDU Guard" blog post after I experienced the consequences of not having one.

  4. Anonymous.. Please read RFC 1925...
    #4... Read it 5 times..

    I'm still shocked every time I read it how much wisdom there is in this old RFC.



Ivan Pepelnjak, CCIE#1354, is the chief technology advisor for NIL Data Communications. He has been designing and implementing large-scale data communications networks as well as teaching and writing books about advanced technologies since 1990. See his full profile, contact him or follow @ioshints on Twitter.