An engineer attending my VMware Networking Deep Dive webinar has asked me a tough question that I was unable to answer:
What happens if a VM running within a vSphere host sends a BPDU? Will it get dropped by the vSwitch or will it be sent to the physical switch (potentially triggering BPDU guard)?
I got the answer from visibly harassed Kurt (@networkjanitor) Bales during the Networking Tech Field Day; one of his customers has managed to do just that.
Update 2011-11-04: The post was rewritten based on extensive feedback from Cisco, VMware and numerous readers.
Here’s a sketchy overview of what was going on: they were running a Windows VM inside his VMware infrastructure, decided to configure bridging between a vNIC and a VPN link, and the VM started to send BPDUs through the vNIC. vSwitch ignored them, but the physical switch didn’t – it shut down the port, cutting a number of VMs off the network.
Best case, BPDU guard on the physical switch blocks but doesn’t shut down the port – all VMs pinned to that link get blackholed, but the damage stops there. More often BPDU guard shuts down the physical port (the reaction of BPDU guard is vendor/switch-specific), VMs using that port get pinned to another port, and the misconfigured VM triggers BPDU guard on yet another port, until the whole vSphere host is cut off from the rest of the network. Absolutely worst case, you’re running VMware High Availability, the vSphere host triggers isolation response, and the offending VM is restarted on another host (eventually bringing down the whole cluster).
There is only one good solution to this problem: implement BPDU guard on the virtual switch. Unfortunately, no virtual switch running in VMware environment implements BPDU guard.
Interestingly, the same functionality would be pretty easy to implement in Xen/XenServer/KVM; either by modifying the open-source Open vSwitch or by forwarding the BPDU frames to OpenFlow controller, which would then block the offending port.
Nexus 1000V seems to offer a viable alternative. It has an implicit BPDU filter (you cannot configure it) that would block the BPDUs coming from a VM, but that only hides the problem – you could still get forwarding loops if a VM bridges between two vNICs connected to the same LAN. However, you can reject forged transmits (source-MAC-based filter, a standard vSphere feature) to block bridged packets coming from a VM.
According to VMware’s documentation (ESX Configuration Guide) the default setting in vSphere 4.1 and vSphere 5 is to accept forged transmits.
Lacking Nexus 1000V, you can use a virtual firewall (example: vShield App) that can filter layer-2 packets based on ethertype. Yet again, you should combine that with rejection of forged transmits.
No BPDUs past this point
In theory, an interesting approach might be to use VM-FEX. A VM using VM-FEX is connected directly to a logical interface in the upstream switch and the BPDU guard configured on that interface would shut down just the offending VM. Unfortunately, I can’t find a way to configure BPDU guard in UCS Manager.
Other alternatives to BPDU guard in a vSwitch range from bad to worse:
Disable BPDU guard on the physical switch. You’ve just moved the problem from access to distribution layer (if you use BPDU guard there) ... or you’ve made the whole layer-2 domain totally unstable, as any VM could cause STP topology change.
Enable BPDU filter on the physical switch. Even worse – if someone actually manages to configure bridging between two vNICs (or two physical NICs in a Hyper-V host), you’re toast; BPDU filter causes the physical switch to pretend the problem doesn’t exist.
Enable BPDU filter on the physical switch and reject forged transmits in vSwitch. This one protects against bridging within a VM, but not against physical server misconfiguration. If you’re absolutely utterly positive all your physical servers are vSphere hosts, you can use this approach (vSwitch has built-in loop prevention); if there’s a minimal chance someone might connect bare-metal server or a Hyper-V/XenServer host to your network, don’t even think about using BPDU filter on the physical switch.
Anything else? Please describe your preferred solution in the comments!
BPDU filter available in Nexus 1000V or ethertype-based filters available in virtual firewalls can stop the BPDUs within the vSphere host (and thus protect the physical links). If you combine it with forged transmit rejection, you get a solution that protects the network from almost any VM-level misconfiguration.
However, I would still prefer a proper BPDU guard with logging in the virtual switches for several reasons:
- BPDU filter just masks the configuration problem;
- If the vSwitch accepts forged transmits, you could get an actual forwarding loop;
- While the solution described above does protect the network, it also makes troubleshooting a lot more obscure – a clear logging message in vCenter telling the virtualization admin that BPDU guard has shut down a vNIC would be way better;
- Last but definitely not least, someone just might decide to change the settings and accept forged transmits (with potentially disastrous results) while troubleshooting a customer problem @ 2AM.