virtualization « ipSpace.net blog

Thursday, June 5, 2025 07:23 +0200

Weird: Ports on Linux Bridge Are Stuck

Just when you thought you got used to the weirdnesses in the networking implementations, you get a curveball like this one. Life is never dull if you test network devices.

Before releasing netlab 2.0, I ran the full suite of integration tests for all devices for which I have the images. Interestingly, most VXLAN tests failed for Cumulus Linux 4.x even though we haven’t touched that code for ages.

Next step: trying to figure out what changed. The configuration changes were minimal. Even worse, the failure was non-deterministic. Somehow, we managed to transform a Cumulus Linux 4.x VM into a Heisenberg switch.


Monday, March 17, 2025 09:10 +0100

Arista EOS Spooky Action at a Distance

This blog post describes yet another bizarre behavior discovered during the netlab integration testing.

It started innocently enough: I was working on the VRRP integration test and wanted to use Arista EOS as the second (probe) device in the VRRP cluster because it produces nice JSON-formatted results that are easy to use in validation tests.

Everything looked great until I ran the test on all platforms on which netlab configures VRRP, and all of them passed apart from Arista EOS (that was before we figured out how Sturgeon’s Law applies to VRRPv3) – a “That’s funny” moment that was directly responsible for me wasting a few hours chasing white rabbits down this trail.

The Linux Bridge MTU Hell

It all started with an innocuous article describing the MTU basics. As the real purpose of the MTU is to prevent packet drops due to fixed-size receiver buffers, and I ~~waste~~ spend most of my time in virtual labs, I wanted to check how various virtual network devices react to incoming oversized packets.

As the first step, I created a simple netlab topology in which a single link had a slightly larger than usual MTU… and then all hell broke loose.


Tuesday, March 4, 2025 07:43 +0100

Capturing Traffic in Virtual Networking Labs

When I announced the Stub Networks in Virtual Labs blog post on LinkedIn, I claimed it was the last chapter in the “links in virtual labs” saga. I was wrong; here comes the fourth part of the virtual links trilogy – capturing “on the wire” traffic in virtual networking labs.

While network devices provide traffic capture capabilities (usually tcpdump in disguise generating a .pcap file), it’s often better to capture the traffic outside of the device to see what the root cause of the problems you’re experiencing might be.


Tuesday, February 25, 2025 07:55 +0100

Stub Networks in Virtual Labs

The previous blog posts described how virtualization products create LAN segments and point-to-point links.

However, sometimes we need stub segments – segments connected to a single router or switch – because we don’t want to waste resources creating hosts attached to a network device, but would still prefer a more realistic mechanism than static routes to inject IP subnets into routing protocols.


Wednesday, February 12, 2025 07:55 +0100

Point-to-Point Links in Virtual Labs

In the previous blog post, I described the usual mechanisms used to connect virtual machines or containers in a virtual lab, and the drawbacks of using Linux bridges to connect virtual network devices.

In this blog post, we’ll see how KVM/QEMU/libvirt/Vagrant use UDP tunnels to connect virtual machines, and how containerlab creates point-to-point vEth links between Linux containers.


Monday, February 3, 2025 08:27 +0100

Links in Virtual Labs

There are three major ways to connect network devices in the physical world:

Point-to-point links between devices (usually using some variant of Ethernet)
Multi-access layer-1 networks running some IEEE 802.x encapsulation on top of that (GPON, WiFi, Ethernet hubs)
Multi-access switched layer-2 network (dumb switches, hopefully running some STP variant)

Implementing these connections in virtual labs is a bit harder than one might think, as all virtualization solutions assume you plan to run virtual servers connected to Ethernet segments.


Monday, May 6, 2024 08:25 +0200… updated on Thursday, February 20, 2025 09:52 +0100

Famous Last Words: I'm Too Stupid for That

Some networking vendors realized that one way to gain mindshare is to make their network operating systems available as free-to-download containers or virtual machines. That’s the right way to go; I love their efforts and point out who went down that path whenever possible¹ (as well as others like Cisco who try to make our lives miserable).

However, those virtual machines better work out of the box, or you’ll get frustrated engineers who will give up and never touch your warez again, or as someone said in a LinkedIn comment to my blog post describing how Junos vPTX consistently rejects its DHCP-assigned IP address: “If I had encountered an issue like this before seeing Ivan’s post, I would have definitely concluded that I am doing it wrong.”²



Tuesday, December 13, 2022 07:11 UTC

DPU Hype Considered Harmful

The hype generated by the “VMware supports DPU offload” announcement already resulted in fascinating misunderstandings. Here’s what I got from a System Architect:

We are dealing with an interesting scenario where a customer had limited data center space, but applications demand more resources. We are evaluating whether we could offload ESXi processing to DPUs (Pensando) to use existing servers as bare-metal servers. Would it be a use case for DPU?

First of all, congratulations to whichever vendor marketer managed to put that guy in that state of mind. Well done, sir, well done. Now for a dose of reality.

Are DPUs Any Good?

After VMware launched DPU-based acceleration for VMware NSX, marketing-focused websites frantically started discussing the benefits of DPUs. Although I’ve been writing about SmartNICs and DPUs for years, it’s time for another closer look at the emperor’s clothes.

What Is a DPU

DPU (Data Processing Unit) is a fancier name for a network adapter formerly known as a SmartNIC: a server repackaged into an interface card form factor. We’ve had them for decades (anyone remember iSCSI offload adapters?).


Wednesday, April 13, 2022 06:42 UTC

AWS Automatic EC2 Instance Recovery

On March 30th, 2022, AWS announced automatic recovery of EC2 instances. Does that mean that AWS reached feature parity with VMware High Availability, or that VMware got it right from the very start? No and no.

Automatic Instance Recovery Is Not High Availability

Reading the AWS documentation (as opposed to the feature announcement) quickly reveals a caveat or two. The automatic recovery is performed if an instance becomes impaired because of an underlying hardware failure or a problem that requires AWS involvement to repair.

Worth Reading: VMware Operations Guide

Iwan Rahabok’s open-source VMware Operations Guide is now also available in Markdown-on-GitHub format. Networking engineers supporting vSphere/NSX infrastructure might be particularly interested in the Network Metrics chapter.


Wednesday, February 16, 2022 09:03 UTC

Running BGP between Virtual Machines and Data Center Fabric

Got this question from one of my readers:

When adopting the BGP on the VM model (say, a Kubernetes worker node on top of vSphere or KVM or Openstack), how do you deal with VM migration to another host (same data center, of course) for maintenance purposes? Do you keep peering with the old ToR even after the migration, or do you use some BGP trickery to allow the VM to peer with whatever ToR it’s closest to?

Short answer: you don’t.

Kubernetes was designed in a way that made worker nodes expendable. The Kubernetes cluster (and all properly designed applications) should recover automatically after a worker node restart. From the purely academic perspective, there’s no reason to migrate VMs running Kubernetes.


Thursday, January 27, 2022 09:34 UTC

MTU Settings in Virtual Network Devices

When I finally¹ managed to get SR Linux running with netlab, I wanted to test how it interacts with Cumulus VX and FRR in an OSPF+BGP lab… and failed. Jeroen Van Bemmel quickly identified the culprit: MTU. Yeah, it’s always the MTU (or DNS, or BGP).

I never experienced a similar problem, so of course I had to identify the root cause:

Worth Reading: Xen on AWS Nitro NICs

If you find smart NICs interesting, you’ll like the latest blog post by James Hamilton explaining how AWS emulated the Xen environment on Nitro hardware to keep old VM instances running on new hardware.

