Worth Reading: Use of HTTPS DNS Resource Records
Around 30 years after we got the first website, the powers that be realized it might make sense to put this is how you access a web server information (including its IPv4 and IPv6 address, and HTTP(S) support information) directly into DNS, using HTTPS Resource Records. It took us long enough 🤷♂️
The BGP Multi-Exit Discriminator (MED) Saga
Martijn Van Overbeek left this comment on my LinkedIn post announcing the BGP MED lab:
It might be fixed, but I can recall in the past that there was a lot of quirkiness in multi-vendor environments, especially in how different vendors use it and deal with the setting when the attribute does exist or does not have to exist.
TL&DR: He’s right. It has been fixed (mostly), but the nerd knobs never went away.
In case you’re wondering about the root cause, it was the vagueness of RFC 1771. Now for the full story ;)
BGP Labs: Set BGP Communities on Outgoing Updates
It’s hard to influence the behavior of someone with strong opinions (just ask any parent with a screaming toddler), and trying to persuade an upstream ISP not to send the traffic over a backup link is no exception – sometimes even AS path prepending is not a strong enough argument.
An easy solution to this problem was proposed in 1990s – what if we could attach some extra attributes (called communities just to confuse everyone) to BGP updates and use them to tell adjacent autonomous systems to lower their BGP local preference? You can practice doing that in the Attach BGP Communities to Outgoing BGP Updates lab exercise.
Can a Router Use the Default Route to Reach BGP Next Hops?
TL&DR: Yes.
Starting with RFC 4271, Route Resolvability Condition:
- A route without an outgoing interface is resolvable if its next hop is resolvable without recursively using the same route.
- A route with an outgoing interface is always considered resolvable.
- BGP routes can be resolved through routes with just a next hop or an outgoing interface.
Worth Reading: Network Automation with GitHub Actions
George Davitiani put together a lovely proof-of-concept using GitHub actions to deploy modified configurations to network devices. Even better, he documented the whole setup, and the way to reproduce it. I’m positive you’ll find a few ideas browsing through what he did.
Worth Reading: Going CCNP Emeritus
Daniel Teycheney decided not to renew his CCNP status and used this opportunity to publish his thoughts on IT certifications. Not surprisingly, I agree with most of the things he said, but I never put it in writing so succinctly.
Red Pill Warning: Reading his blog post might damage your rosy view of the networking industry. You’ve been warned ;)
Video: Language Models in AI/ML Landscape
In September 2023, Javier Antich extended the AI/ML in Networking webinar with a new section describing large language models (LLMs), starting with how do the LLMs fit into the AI/ML landscape?
BGP Labs: AS-Path Prepending
In the previous lab, you learned how to use BGP Multi-Exit Discriminator (MED) to influence incoming traffic flow. Unfortunately, MED works only with parallel links to the same network. In a typical Redundant Internet Connectivity scenario, you want to have links to two ISPs, so you need a bigger hammer: AS Path Prepending.
Why Do We Need BGP Identifiers?
A friend of mine sent me an interesting question along these lines:
We all know that in OSPF, the router ID is any 32-bit number, not necessarily an IP address of an interface. The only requirement is that it must be unique throughout the OSPF domain. However, I’ve always wondered what the role of BGP router ID is. RFC 4271 says it should be set to an IP address assigned to that BGP speaker, but where do we use it?
Also, he observed somewhat confusing behavior in the wild:
Take two routers and configure the same BGP identifier on both. Cisco IOS will not establish a session, while IOS XR and Junos will.
I decided to take the challenge and dug deep into the bowels of RFC 4271 and RFC 6286. Here’s what I brought back from that rabbit hole:
Is BGP TTL Security Any Good?
After checking what routers do when they receive a TCP SYN packet from an unknown source, I couldn’t resist checking how they cope with TCP SYN packets with too-low TTL when using TTL security, formally known as The Generalized TTL Security Mechanism (GTSM) defined in RFC 5082.
TL&DR: Not bad: most devices I managed to test did a decent job.
VXLAN/EVPN Layer-3 Handoff (L3Out) on Arista EOS
A while ago, I published a blog post describing how to establish a LAN/WAN L3 boundary in VXLAN/EVPN networks using Cisco NX-OS. At that time, I promised similar information for Arista EOS. Here it is, coming straight from Massimo Magnani. The useful part of what follows is his; all errors were introduced during my editing process.
In the cases I have dealt with so far, implementing the LAN-WAN boundary has the main benefit of limiting the churn blast radius to the local domain, trying to impact the remote ones as little as possible. To achieve that, we decided to go for a hierarchical solution where you create two domains, local (default) and remote, and maintain them as separate as possible.
Video: Outages Caused by Bugs in BGP Implementations
The previous BGP-related videos described how fat fingers and malicious actors cause Internet outages.
Today, we’ll focus on the impact of bugs in BGP implementations, from malformed AS paths to mishandled transitive attributes. The examples in the video are a few years old, but you can see similar things in the wild in 2023.
Worth Reading: Cloudflare Control Plane Outage
Cloudflare experienced a significant outage in early November 2023 and published a detailed post-mortem report. You should read the whole report; here are my CliffsNotes:
- Regardless of how much redundancy you have, sometimes all systems will fail at once. Having redundant systems decreases the probability of total failure but does not reduce it to zero.
- As your systems grow, they gather hidden- and circular dependencies.
- You won’t uncover those dependencies unless you run a full-blown disaster recovery test (not a fake one)
- If you don’t test your disaster recovery plan, it probably won’t work when needed.
Also (unrelated to Cloudflare outage):
BGP Labs: Using Multi-Exit Discriminator (MED)
In the previous labs, we used BGP weights and Local Preference to select the best link out of an autonomous system and thus change the outgoing traffic flow.
Most edge (end-customer) networks face a different problem – they want to influence the incoming traffic flow, and one of the tools they can use is BGP Multi-Exit Discriminator (MED).
Is Anyone Using netlab on Windows?
Tomas wants to start netlab with PowerShell, but it doesn’t work for him, and I don’t know anyone running netlab directly on Windows (I know people running it in a Ubuntu VM on Windows, but that’s a different story).
In theory, netlab (and Ansible) should work fine with Windows Subsystem for Linux. In practice, there’s often a gap between theory and practice – if you run netlab on Windows (probably using VirtualBox with Vagrant), I’d love to hear from you. Please leave a comment, email me, add a comment to Tomas’ GitHub issue, or fix the documentation and submit a PR. Thank you!