iBGP Local-AS Route Propagation
In the previous blog post on this topic, I described the iBGP local-as functionality and explained why we MUST change the BGP next hop on the routes sent over the fake iBGP session (TL&DR: because we’re not running IGP across that link).
That blog post used a simple topology with three routers. Now let’s add a few more routers to the mix and see what happens.
Network Automation Reality Check with William Collins
In early August, William Collins invited me to chat about a sarcastic comment I made about a specific automation tool I have a love-hate relationship with on LinkedIn.
We quickly agreed not to go (too deep) into tool-bashing. Instead, we discussed the eternal problems of network automation, from unhealthy obsession with tools to focus on point solutions while lacking the bigger picture or believing in vendor-delivered nirvana.
SwiNOG 40: Application-Based Source Routing with SRv6
The we should give different applications different paths across the network idea never dies (even though in many places the residential Internet gives you enough bandwidth to watch 4K videos), and the Leveraging Intent-Based Networking and SRv6 for Dynamic End-to-End Traffic Steering (video) by Severin Dellsperger was an interesting new riff on that ancient grailhunt.
Their solution uses SRv6 for traffic steering1, an Intent-Based System2 that figures out paths across the network, and eBPF on client hosts3 to add per-application SRv6 headers to outgoing traffic.
EVPN Designs: Layer-3 Inter-AS Option A
A netlab user wanted to explore a multi-site design where every site runs an independent EVPN fabric, and the inter-site link is either a layer-2 or a layer-3 interconnect (DCI). Let’s start with the easiest scenario: a layer-3 DCI with a separate (virtual) link for every tenant (in the MPLS/VPN world, we’d call that Inter-AS Option A)

When Switches Flood LLDP Traffic
A networking engineer (let’s call him Joe1) sent me an interesting challenge: they built a data center network with Cisco switches, and the switches flood LLDP packets between servers.
That would be interesting by itself (the whole network would appear as a single hub), but they’re also using DCBX (which is riding in LLDP TLVs), and the DCBX parameters are negotiated between servers (not between servers and adjacent switches), sometimes resulting in NIC resets2.
ArubaCX Decides When You're Done Changing a BGP Routing Policy
When I was cleaning the “set BGP MED” integration test, I decided that once a BGP prefix is in the BGP table of the BGP peer, there’s no need for a further wait before checking its MED value. After all:
- We configure an outbound routing policy to change MED;
- We execute do clear bgp * soft out at the end of most BGP policy configuration templates1
- The device under test should thus immediately (re)send the expected BGP prefix with the target MED.
That approach failed miserably with ArubaCX; it was time to investigate the details.
Configuring BGP Community Propagation is Confusing
A large number of vendors claim to use industry-standard CLI, which means “something that looks like Cisco IOS, but we can’t say that in public.” The implementations of that “standard” are full of quirks; as I was making fun of Cisco IOS last week, it’s only fair to look at how others deal with BGP community propagation.
netlab has BGP configuration templates for 14 different platforms1, including these implementations that look like Cisco IOS from a distance if you squint just right2: Arista EOS, Aruba CX, and FRRouting. You can check the configuration templates if you wish; here’s the TC&DB3 overview:
SwiNOG 40: Trustworthy Network Automation
The SwiNOG 40 event started with an interesting presentation on Building Trustworthy Network Automation (video) by Damien Garros (now CEO @ OpsMill) who discussed the principles one can use to build a trustworthy network automation solution, including idempotency, dry runs, and transactional changes. He also covered the crucial roles of the declarative approach, version control, and testing.
If you have ever watched any of my network automation materials, you won’t be surprised by anything he said, but if you’re just starting your network automation journey, you MUST watch this presentation to get your bearings straight.
Fun Reading: AI: Great Expectations
Rodney Brooks republished an article on great AI expectations that he wrote 37 years ago. Not surprisingly, apart from a few technical details triggered by four decades of exponential growth in silicon capabilities, the article could have been written yesterday.
Side note: I’m a bit younger than Rodney, but I also went through at least three waves of AI hype cycles, starting with Prolog and 4GL, then expert systems, and finally neural networks. Around that time, I stopped caring and focused on networking, but I have enough battle scars to remain skeptical.
BGP Community Propagation on Cisco IOS/XE: The 90's Called
Just when I thought no vendor stupidity peculiarity could surprise me, Cisco IOS/XE proved me wrong.
I was improving a completely unrelated BGP functionality. I ran BGP integration tests on Cisco IOL (because it’s the fastest one to boot), and the BGP community propagation test failed. After verifying that I did not change the template and that the data structures had not changed, I checked the IOL release I was using.
Surprise 🎉🎉: the neighbor send-community configurations that worked since (at least) the IOS Classic release 15.x stopped working in Cisco IOS/XE release 17.16.01a.
MUST READ: Storage Devices and Latency
PlanetScale published a great article describing the high-level principles of how storage devices work and covering everything from tape drives to SSDs and network-attached storage — a must-read for anyone even remotely interested in how their data is stored.
Fun Reading: Who is LLM?
Is an LLM a stubborn donkey, a genie, or a slot machine (and why)? Find out in the Who is LLM? article by Martin Fowler.
ArubaCX: When BGP Soft Reconfiguration Becomes a No-Op
Changing an existing BGP routing policy is always tricky on platforms that apply line-by-line changes to device configurations (Cisco IOS and most other platforms claiming to have industry-standard CLI, with the notable exception of Arista EOS). The safest approach seems to be:
- Do not panic when the user makes changes to route maps and underlying filters (prefix lists, AS-path access lists, or community lists).
- Let the user decide when they’re done and process the BGP table with the new routing policy at that time.
Ultra Ethernet: Reinventing X.25
One should never trust the technical details published by the industry press, but assuming the Tomahawk Ultra puff piece isn’t too far off the mark, the new Broadcom ASIC (supposedly loosely based on emerging Ultra Ethernet specs):
- Uses Optimized Ethernet Header, replacing IP/UDP header with a 10-byte something (let’s call it session identifier)
- Makes Ethernet lossless with hop-by-hop retransmission/error recovery
- Uses credit-based flow control (the receiver continuously updates the sender about the amount of available space)
If you’re ancient enough, you might recognize #3 as part of Fibre Channel, #2 and #3 as part of IEEE 802.1 LLC2 (used by IBM to implement SNA over Token Ring and Ethernet), and all three as the fundamental ideas of X.25 that Broadcom obviously reinvented at 800 Gbps speeds, proving (yet again) RFC 1925 Rule 11.
Always Check Your Tests Against Faulty Inputs
A while ago, I published a blog post proudly describing the netlab integration test that should check for incorrect OSPF network types in netlab-generated device configurations. Almost immediately, Erik Auerswald pointed out that my test wouldn’t detect that error (it might detect other errors, though) as the OSPF network adjacency is always established even when the adjacent routers have mismatching OSPF network types.
I made one of the oldest testing mistakes: I checked whether my test would work under the correct conditions but not whether it would detect an incorrect condition.