Network Digital Twins: Between PowerPoint and Reality

A Network Artist left an interesting remark on one of my blog posts:

It’s kind of confusing sometimes to see the digital twin (being a really good idea) never really take off.

His remark prompted me to resurface a two-year-old draft listing a bunch of minor annoyances that make Network Digital Twins more of a PowerPoint project than a reality.

Let’s start with the easy ones:

  • Since Dynamips ran out of platforms to emulate, I haven’t seen a virtual machine (or container) that supports anything other than Ethernet interfaces. That might not matter in 2025, but if you happen to have any other technology in your network, it’s an immediate showstopper.
  • Network operating systems packaged as virtual machines often have different interface names than the real hardware. The only vendors I’ve seen dealing with that were Cumulus Linux, which could (being based on Linux) simply rename the devices (Ethernet ports), and Arista EOS, where you can specify the mapping of Linux interfaces into Arista EOS interface names with a JSON configuration file.
  • Very few virtual machines that emulate chassis switches allow you to specify the line cards you want to use. The only exceptions I’m aware of are Nokia SR Linux and SR-OS.
  • Virtual machines usually have a limited number of interfaces, whether due to VM limitations or limitations of the virtualization infrastructure. That could make it impossible to reliably emulate large core switches.
  • RAM and CPU requirements: Some virtual machines emulating bloated network devices require 4+ CPU cores and 16+ GB of RAM. On the other hand, apart from Clabernetes, I haven’t seen any serious effort to build a Digital Twin Infrastructure that would be able to deploy the workload on a server cluster. It must be great fun building a server that can emulate a large NX-OS fabric.
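To illustrate the Arista EOS interface-mapping approach mentioned above: the idea is a JSON file (stored on the device flash) that maps Linux interface names to EOS interface names. The sketch below is an assumption based on the publicly described "Arbitrary Interface Mapping" feature; the exact file name and schema may differ across EOS releases, so check Arista's documentation before relying on it.

```json
{
  "ManagementIntf": {
    "eth0": "Management1"
  },
  "EthernetIntf": {
    "eth1": "Ethernet1/1",
    "eth2": "Ethernet1/2"
  }
}
```

With a mapping like this in place, the virtual device presents the same interface names as the physical hardware, which is what makes reusing production configurations and automation scripts feasible.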

But wait, we’ve just gotten started. There’s the tiny detail of data-plane emulation. I’ve heard of a single company (NVIDIA) claiming it’s trying to emulate its ASICs in virtual machines.

Anyway:

  • Virtual data plane functionality often doesn’t match the ASIC behavior (more details, examples). Even worse, some virtual machines cannot deal with basic features like interface bandwidth. I don’t want to know how reliable QoS emulation is on platforms that do QoS in hardware.
  • Printouts related to the data plane functionality probably don’t match between virtual machines and physical hardware, making it impossible to test any network automation solution that relies on inspecting hardware details.
  • Some control-plane protocols might not work as expected (I had problems with some BFD implementations).
  • Control-plane protection might not work (I’m not brave enough to try it out).
  • I never tested the complex failover functionality (such as TI-LFA), but I wouldn’t be surprised to find quirks.

Long story short: It looks like most vendors decided the primary use case of the VM versions of their network devices is kicking the tires and getting familiar with the platform. That’s awesome, and I can’t tell you how important it is for someone evaluating a new platform to gain some hands-on experience with it. However, there’s a very long way between this use case and a reliable (and thus useful) digital twin.

I hope the reality is not as bleak as it looks from here; if you know better, please leave a comment.

Finally, let’s address the pair of elephants that was patiently waiting in the corner of the room:

The generic “I can use a digital twin to test changes in my network” idea is unfortunately about as sound as “I can move my VM around the world to minimize the latency for currently-active users.” Both look great in PowerPoint, but match reality about as closely as a spherical cow in a vacuum.

Revision History

2025-06-19
  • Stefan de Kooter submitted a PR pointing out that you can specify the emulated hardware configuration on SR-OS
  • Charles Monson pointed out Arista’s interface mapping capability.

5 comments:

  1. Arista also has a method for renaming interfaces (the documentation appears to be behind a login, but search "Arbitrary Interface Mapping on vEOS & cEOS").

    It does help with testing some things, but given the differences between software and hardware forwarding, even if it accepts all the commands, a 'digital twin' that actually validates behavior will remain a pipe dream.

    Replies
    1. You're absolutely right, and we're using that functionality in netlab. Have to fix the blog post.

  2. I agree: building a full network twin is often unrealistic.

    Still, I’d love to get your take on Google’s approach in its Autonomous Network Operations Framework. From what I understand, their “digital twin” is essentially a data model built from configs, telemetry, and other network data. They use it to verify intent and predict issues without simulating the control plane at all. Did I get that right, and what’s your perspective on that approach?

  3. I built a rather cumbersome set of scripts to create a containerlab topology from our NetBox instance to simulate our production network. It's based on Linux containers and cRPD instances, so resource usage is reasonable.

    Once spun up, I could run our normal automation against the lab instances to get them configured. It took some fake DNS and a bit of messing with the interface names, but it works.

    Obviously it can't simulate everything, but I get routing tables that exactly match what we see in production, which allows me to tinker with protocols, policies, link costs or whatever and see exactly what the results would be. A lot of work and the scope is obviously limited, but it's been very useful for us.

    Replies
    1. So glad to hear it worked for you. You had a perfect use case -- a reasonably sized device instance (cRPD) and a focus on routing protocols.
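For readers who want to try something along the lines of the comment above: a minimal containerlab topology with two cRPD nodes might look like the sketch below. The node names, image tag, and kind value are assumptions rather than details from the commenter's setup; check the containerlab documentation for the exact kind name and supported images.

```yaml
name: prod-twin
topology:
  nodes:
    core1:
      kind: juniper_crpd      # containerized Junos routing daemon
      image: crpd:24.2R1      # image tag is an assumption
    core2:
      kind: juniper_crpd
      image: crpd:24.2R1
  links:
    # each entry becomes a point-to-point veth link between the containers
    - endpoints: ["core1:eth1", "core2:eth1"]
```

Because cRPD runs only the routing control plane in a lightweight container, dozens of such nodes fit on a single server, which is exactly why the commenter's routing-focused use case worked so well.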

  4. Well, I think another use case for digital twins is performance optimization. In practice, we usually build a smaller version of the system and then scale it up. Arguably, it could be better to design the system in a virtual environment and try to perform some optimizations before the real system exists...

    Replies
    1. What you're mentioning is a completely different use case -- it's not recreating an existing network but more like a "proof of concept". A virtual environment is the perfect place to do that, unless you require a realistic data-plane implementation or performance.

  5. It's interesting that we have fully functional digital flight simulators for modern aircraft, but networking has been left significantly behind in this direction.

    Is the underlying problem the technology or the intent?

    Obviously, vendors won't care much, since most environments are multi-vendor, unless you are a single-vendor shop.

    Veriflow was taken over by VMware (now Broadcom) a long time ago, and I haven't heard anything about them since. Would you like to share your thoughts on Forward Networks and IP Fabric in this area?
