Using netlab to Argue with Vendor TAC

A happy netlab user sent me an unexpected use case: they successfully used its multi-vendor capabilities to argue with a vendor TAC. Here’s the gist of the story (edited/anonymized for obvious reasons):

They deployed a configuration change that resulted in an unexpected outage. The outage partially disrupted the data center network, so they didn’t have the luxury of collecting data and reproducing the issue, as they had to roll back the change as expeditiously as possible.

So far so good. These things happen. Unfortunately, the next step turned out to be a TAC hell loop.

A TAC loop ensued: they kept sending the vendor TAC drawings, show command outputs illustrating the problem, configs, and explanations, but the TAC engineers kept asking for more data (tech support dumps among them, of course). One of them even insisted on calling the situation ‘works as expected.’

They were probably also asked to upgrade the software and reload the boxes 🤦‍♂️. I’ve been in similar situations several times, but in those days, we didn’t have virtual network devices. Now we do ;)

They subsequently recreated the physical topology in a netlab lab with the minimum number of devices. Their setup was simple enough that they could use the standard netlab functionality to try to recreate the bug, and as expected, Arista cEOS and FRR didn’t cause an outage. Grumpily (according to the user), they jumped through the hoops needed to run netlab up with the vendor’s devices. Lo and behold, the outage could be reproduced.

Instead of continuing the never-ending TAC discussions, they sent the vendor TAC team the netlab topology file and asked them to reproduce the issue on their own.
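
For readers who have never seen one, a topology file of that kind can be tiny. The following is only a sketch of what a minimal reproduction lab might look like, not the user's actual file: the node names, the three-node layout, the OSPF module, and the containerlab provider are all assumptions made for illustration.

```yaml
# Hypothetical minimal lab; every name and choice here is illustrative,
# not taken from the user's topology.
defaults.device: frr      # default device type; change it to compare platforms
provider: clab            # containerlab; VM-based vendor images may need libvirt
module: [ ospf ]          # assumed routing protocol, purely as an example

nodes: [ leaf1, leaf2, spine ]

links:
- leaf1-spine
- leaf2-spine
```

Running netlab up in the directory containing such a file starts the lab. Changing the device type while leaving the nodes and links untouched is what makes it practical to check whether a problem follows the topology or a particular platform.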

Note: the hoops they had to jump through were caused by netlab failing to install libvirt on an Ubuntu 25.04 VM with nested virtualization. The root cause might have been the netlab installation script (now fixed), which had never been tested on Ubuntu versions newer than 24.04.

Anyway, two weeks later, I received a wonderful follow-up message:

After sending vendor TAC the netlab topology.yml, they had to admit the issue, and they have escalated it to the engineers working on the software :-) There is some talk about an esoteric/hidden nerd knob 🤢, but the main point is that netlab helped to return the TAC ping-pong conversation to something productive!

Have you used netlab in an interesting or unexpected way? Please let me know.
