Worth Reading: Switching to IP fabrics
Namex, an Italian IXP, decided to replace their existing peering fabric with a fully automated leaf-and-spine fabric using VXLAN and EVPN running on Cumulus Linux.
They documented the design, deployment process, and automation scripts they developed in an extensive blog post that’s well worth reading. Enjoy ;)
Video: Cisco SD-WAN Policy Design
In the final video in his Cisco SD-WAN webinar, David Penaloza discusses site ID assignments and policy processing order.
A carefully planned site scheme and ordered list of policy entries will save you complications and headaches when deploying the SD-WAN solution.
Routing Protocols: Use the Best Tool for the Job
When I wrote about my sample OSPF+BGP hands-on lab on LinkedIn, someone couldn’t resist asking:
I’m still wondering why people use two routing protocols and do not have clean redistribution points or tunnels.
Ignoring for the moment the fact that he missed the point of the blog post (completely), the idea of “using tunnels or redistribution points instead of two routing protocols” hints at the potential applicability of RFC 1925 rule 4.
Unnumbered Ethernet Interfaces
Imagine an Internet Service Provider offering Ethernet-based Internet access (aka everyone using fiber access, excluding people believing in Russian dolls). If they know how to spell security, they might be nervous about connecting numerous customers to the same multi-access network, but it seems they have only two ways to solve this challenge:
- Use private VLANs with proxy ARP on the head-end router, forcing the customer-to-customer traffic to pass through layer-3 forwarding on the head-end router.
- Use a separate routed interface with each customer, wasting three-quarters of their available IPv4 address space.
Is there a third option? Can’t we pretend Ethernet works in almost the same way as dialup and use unnumbered IPv4 interfaces?
Single-Metric Unequal-Cost Multipathing Is Hard
A while ago, we discussed whether unequal-cost multipathing (UCMP) makes sense (TL&DR: rarely), and whether we could implement it in link-state routing protocols (TL&DR: yes). Even though we could modify OSPF or IS-IS to support UCMP, and Cisco IOS XR even implemented those changes (they are not exactly widely used), the results are… suboptimal.
Imagine a simple network with four nodes, three equal-bandwidth links, and a link that has half the bandwidth of the other three:
netsim-tools release 0.7: Cumulus VX, EIGRP, and BGP IPv6 AF
netsim-tools release 0.7 is published, bringing you the following goodies (including stuff published a week ago as release 0.6.3):
- Cumulus VX support on libvirt and virtualbox.
- EIGRP configuration module
- BGP IPv6 address family
- Controlled BGP community propagation
Other changes include:
Worth Reading: Azure Datacenter Switch Failures
Microsoft engineers published an analysis of switch failures in 130 Azure regions (review of the article, The Next Platform summary):
- A data center switch has a 2% chance of failing in 3 months (= less than 10% per year);
- ~60% of the failures are caused by hardware faults or power failures, another 17% are software bugs;
- 50% of failures lasted less than 6 minutes (obviously crashes or power glitches followed by a reboot).
- Switches running SONiC had lower failure rate than switches running vendor NOS on the same hardware. Looks like bloatware results in more bugs, and taking months to fix bugs results in more crashes. Who would have thought…
Worth Reading: Running BGP in Large-Scale Data Centers
Here’s one of the major differences between Facebook and Google: one of them publishes research papers with helpful and actionable information, the other uses publications as recruitment drive full of we’re so awesome but you have to trust us – we’re not sharing the crucial details.
Recent data point: Facebook published an interesting paper describing their data center BGP design. Absolutely worth reading.
Just in case you haven’t realized: Petr Lapukhov of the RFC 7938 fame moved from Microsoft to Facebook a few years ago. Coincidence? I think not.
Video: Kubernetes Principles
After answering the “why should I care about Kubernetes?” question, Stuart Charlton explained the Kubernetes principles you should keep in mind if you want to have a chance of understanding what’s going on.
Local TCP Anycast Is Really Hard
Pete Lumbis and Network Ninja mentioned an interesting Unequal-Cost Multipathing (UCMP) data center use case in their comments to my UCMP-related blog posts: anycast servers.
Here’s a typical scenario they mentioned: a bunch of servers, randomly connected to multiple leaf switches, is offering a service on the same IP address (that’s where anycast comes from).
Packet Forwarding and Routing over Unnumbered Interfaces
In the previous blog posts in this series, we explored whether we need addresses on point-to-point links (TL&DR: no), whether it’s better to have interface or node addresses (TL&DR: it depends), and why we got unnumbered IPv4 interfaces. Now let’s see how IP routing works over unnumbered interfaces.
The Challenge
A cursory look at an IP routing table (or at CCNA-level materials) tells you that the IP routing table contains prefixes and next hops, and that the next hops are IP addresses. How should that work over unnumbered interfaces, and what should we use for the next-hop IP address in that case?
Mythbusting: NFV Data Center Fabric Buffering Requirements
Every now and then I stumble upon an article or a comment explaining how Network Function Virtualization (NFV) introduces new data center fabric buffering requirements. Here’s a recent example:
For Telco/carrier Cloud environments, where NFVs (which are much slower than hardware SGW) get used a lot, latency is higher with a lot of jitter due to the nature of software and the varying link speeds, so DC-level near-zero buffer is not applicable.
It seems to me we’re dealing with another myth. Starting with the basics:
Feedback: Kubernetes Networking Deep Dive
Here’s what one of the engineers watching Stuart Charlton’s Kubernetes webinar wrote about it:
“Kubernetes Networking Deep Dive” is a must see webinar. Once done take a break and then watch it again, let it sink in and then sign-up for a free account with Azure or GCP and practice all that was learned during the webinar.
At the end of this exercise … one will begin to understand why the networking domain seems to be lagging behind … This webinar will help one pick up the pace!
Worth Watching: Rethinking BGP in the Data Center
Ever since draft-lapukhov was first published almost a decade ago, we all knew BGP was the only routing protocol suitable for data center networking… or at least Thought Leaders and vendor marketers seem to be of that persuasion.
Worth Exploring: Working with Linux VRFs
I remember having an interesting discussion about Linux VRFs (as opposed to namespaces) with Dinesh Dutt years ago, but it looks like I never turned it into a blog post.
Now I won’t have to 😉 – Jon Langemak published an excellent Working with Linux VRFs deep dive.