Blog Posts in October 2019
Worth Reading: How Complex Systems Fail
I don’t remember who pointed me to the excellent How Complex Systems Fail document. It’s almost like RFC1925 – I could quote it all day long, and anyone dealing with large mission-critical distributed systems (hint: networks) should read it once a day ;))
Enjoy!
Saved: TCP Is the Most Expensive Part of Your Data Center
Years ago Dan Hughes wrote a great blog post explaining how expensive TCP is. His web site is long gone, but I managed to grab the blog post before it disappeared and he kindly allowed me to republish it.
If you ask a CIO which part of their infrastructure costs them the most, I’m sure they’ll mention power, cooling, server hardware, support costs, getting the right people and all the usual answers. I’d argue one the the biggest costs is TCP, or more accurately badly implemented TCP.
Whitebox Hardware and Open-Source Software
One of my subscribers was interested in trying out whitebox solutions. He wrote:
What open source/whitebox software/hardware should I look at if I wanted to build a leaf-and-spine VXLAN/EVPN/BGP data center.
I don’t think you can get a fully-open-source solution because the ASIC manufacturers hide their SDK behind a mountain of NDAs (that strategy must make perfect sense – after all, it generated such awesome PR for NVIDIA). Anyway, the closest you can get (AFAIK) if you're a mere mortal is Cumulus Linux, and you just choose any whitebox hardware off their Hardware Compatibility List.
Worth Reading: Hard Work
Seth Godin published an interesting article on the value of hard work (and what hard work really is). Go and read it first, then we’ll translate it into networking terms.
Already back? Good, let’s go.
The first worker is a traditional networking technician (it wouldn’t be fair to call him an engineer) – he’s busy configuring VLANs, ACLs, firewall rules… the whole day.
OpenBGPD with Claudio Jeker on Software Gone Wild
Everyone is talking about FRRouting suite these days, while hidden somewhere in the background OpenBGPD has been making continuous progress for years. Interestingly, OpenBGPD project was started for the same reason FRR was forked - developers were unhappy with Zebra or Quagga routing suite and decided to fix it.
We discussed the history of OpenBGPD, its current deployments and future plans with Claudio Jeker, one of the main OpenBGPD developers, in Episode 106 of Software Gone Wild.
Master the Alternate "Public Cloud Networking" Universe
You probably heard me say “networking engineer encountering a public cloud feels like Alice in Wonderland” - packet forwarding works in a different way in every public cloud, subnets are a mix between routed interfaces and VRFs, you cannot change IP addresses without involving the orchestration system…
We covered the networking aspects of Amazon Web Services and Azure in our cloud webinars, but you might need a bigger picture:
Auto-MLAG and Auto-BGP in Cumulus Linux
When I first met Cumulus Networks engineers (during NFD9) their focus on simplifying switch configurations totally delighted me (video).
After solving the BGP configuration challenge (could you imagine configuring BGP in a leaf-and-spine fabric with just a few commands in 2015), they did the same thing with EVPN configuration, where they decided to implement the simplest possible design (EBGP-only fabric running EBGP EVPN sessions on leaf-to-spine links), resulting in another round of configuration simplicity.
Can We Make REST API Transactional Across Multiple Calls?
I got interesting feedback from one of my readers after publishing my REST API Is Not Transactional blog post:
One would think a transactional REST interface wouldn’t be too difficult to implement. Using HTTP1/1, it is possible to multiplex several REST calls into one connection to a specific server. The first call then is a request for start a transaction, returning a transaction ID, to be used in subsequent calls. Since we’re not primarily interested in the massive scalability of stateless REST calls, all the REST calls will be handled by the same frontend. Obviously the last call would be a commit.
I wouldn’t count on HTTP pipelining to keep all requests in one HTTP session (mixing too many layers in a stack never ends well) but we wouldn’t need it anyway the moment we’d have a transaction ID which would be identical to session ID (or session cookie) traditional web apps use.
SD-WAN Vendor Landscape
In the Three Paths of Enterprise IT part of Business Aspects of Networking webinar I covered the traditional networking vendor landscape. Let’s try to do the same for SD-WAN.
It’s clear that we had two types of SD-WAN vendors:
MUST READ: The NTP Bible
A few months ago Johannes Weber sent me a short email saying “hey, I plan to write a few NTP posts” and I replied “well, ping me when you have something ready”.
In the meantime he wrote a veritable NTP bible - a series of NTP-related blog posts covering everything from Why Should I Run My Own NTP Servers to authentication, security and monitoring - definitely a MUST READ if you care about knowing what time it is.
You Cannot Have a Public Cloud without Networking
Listening to (some) industry evangelists you would believe that there’s no future in being a networking engineer. After all, all workloads will move into the cloud, and all clients will connect through a universal 5G network… but even if that utopia eventually comes true, you can’t get away from the laws of physics (and the need networking infrastructure).
TL&DR: our new online course will help you master the shiny new world. You can register right now or keep reading ;)
Disaster Recovery Faking, Take Two
An anonymous (for reasons that will be obvious pretty soon) commenter left a gem on my Disaster Recovery Test Faking blog post that is way too valuable to be left hidden and unannotated.
Here’s what he did:
Once I was tasked to do a DR test before handing over the solution to the customer. To simulate the loss of a data center I suggested to physically shutdown all core switches in the active data center.
That’s the right first step: simulate the simplest scenario (total DC failure) in a way that is easy to recover (power on the switches). It worked…
How Did We End with 1500-byte MTU?
A subscriber sent me this intriguing question:
Is it not theoretically possible for Ethernet frames to be 64k long if ASIC vendors simply bothered or decided to design/make chipsets that supported it? How did we end up in the 1.5k neighborhood? In whose best interest did this happen?
Remember that Ethernet started as a shared-cable 10 Mbps technology. Transmitting a 64k frame on that technology would take approximately 50 msec (or as long as getting from East Coast to West Coast). Also, Ethernet had no tight media access control like Token Ring, so it would be possible for a single host to transmit multiple frames without anyone else getting airtime, resulting in unacceptable delays.
How Do You Provision a 500-Switch Network in a Few Days?
TL&DR: You automate the whole process. What else do you expect?
During the Tech Field Day Extra @ Cisco Live Europe 2019 we were taken on a behind-the-stage tour that included a chat with people who built the Cisco Live network, and of course I had to ask how they automated the whole thing. They said “well, we have the guy that wrote the whole system onsite and he’ll be able to tell you more”. Turns out the guy was my good friend Andrew Yourtchenko who graciously showed the system they built and explained the behind-the-scenes details.
Must read: Shades of Lock-in
Gregor Hohpe published an excellent series on Martin Fowler’s web site focusing on various aspects of lock-in. If nothing else, you SHOULD read the shades of lock-in part, and combine it with my thoughts on lock-in in data center networking.
Video: Retransmissions and Flow Control in Computer Networks
Grouping the features needed in a networking stack in a bunch of layered modules is a great idea. Unfortunately, you could place several essential features like error recovery, retransmission, and flow control in several different layers, from the data link layer dealing with individual network segments, to the transport layer dealing with reliable end-to-end transmissions.
Where should we put those modules? As always, the correct answer is it depends, in this particular case, on transmission reliability, latency, and bandwidth cost. You’ll find more details in the Retransmissions and Flow Control part of How Networks Really Work webinar.
Automation Solution: Network Health State Report
How nice would it be to have a fabric health dashboard displaying a summary of numerous parameters you’re interested in (number of operational uplinks, number of BGP sessions…) for every switch in your fabric.
I’m positive you could hack something together using the customization capabilities of your favorite network management system… or you could write a simple data gathering solution like Stephen Harding did while attending the Building Network Automation Solutions online course.
VMware NSX Killed My EVPN Fabric
I had an interesting discussion with someone running VMware NSX on top of VXLAN+EVPN fabric a while ago. That’s a pretty common scenario considering:
- NSX’s insistence on having all VXLAN uplink from the same server in the same subnet;
- Data center switching vendors being on a lemming-like run praising EVPN+VXLAN;
- The reluctance of non-FAANG environments to connect a server to a single switch.
Apart from the weird times when someone started tons of new VMs, his fabric was running well.
The Cost of Disruptiveness and Guerrilla Marketing
A Docker networking rant coming from my good friend Marko Milivojević triggered a severe case of Deja-Moo, resulting in a flood of unpleasant memories caused by too-successful “disruptive” IT vendors.
Imagine you’re working for a startup creating a cool new product in the IT infrastructure space (if you have an oversized ego you would call yourself “disruptive thought leader” on your LinkedIn profile) but nobody is taking you seriously. How about some guerrilla warfare: advertising your product to people who hate the IT operations (today we’d call that Shadow IT).
Optimizing Environment Setup in Ansible Playbooks
Have you ever seen an Ansible playbook where 90% of the code prepares the environment, and then all the work is done in a few template and assemble modules? Here’s an alternative way of getting that done. Is it better? You tell me ;)
Worth Reading: Anycast DNS in Enterprise Networks
Anycast (advertising the same IP address from multiple servers/locations) has long been used to implement scale-out public DNS services (the whole root DNS system runs on massive anycast), but it’s not as common in enterprise networks.
The blog posts written by Tom Bowles should get you there. He started with the idea and described his implementation using Infoblox DNS.
Want to know even more? I covered numerous load balancing mechanisms including anycast in Data Centers Infrastructure for Networking Engineers webinar.
Redundant BGP Connectivity on a Single ISP Connection
A while ago Johannes Weber tweeted about an interesting challenge:
We want to advertise our AS and PI space over a single ISP connection. How would a setup look like with 2 Cisco routers, using them for hardware redundancy? Is this possible with only 1 neighboring to the ISP?
Hmm, so you have one cable and two router ports that you want to connect to that cable. There’s something wrong with this picture ;)
Network Automation Beyond Configuration Templating
Remember Nicky Davey describing how he got large DMVPN deployment back on track with configuration templating? In his own words…:
Configuration templating is still as big win a win for us as it was a year ago. We have since expanded the automation solution, and reading the old blog post makes me realise how far we have come. I began working with this particular customer in May 2017, so 2 years now. At that time the new WAN project was on the horizon and the approach to network configuration was entirely manual.
Here’s how far he got in the meantime:
New Content: Azure Networking and Automation Source-of-Truth
Last week I covered network security groups, application security groups and user-defined routes in the second live session of Azure Networking webinar.
We also had a great guest speaker on the Network Automation course: Damien Garros explained how he used central source-of-truth based on NetBox and Git to set up a network automation stack from the grounds up.
Recordings are already online; you’ll need Standard ipSpace.net Subscription to access the Azure Networking webinar, and Expert ipSpace.net Subscription to access Damien’s presentation. Azure Networking webinar is also part of our new Networking in Public Clouds online course.
Changing Cisco IOS BGP Policies Based on IP SLA Measurements
This is a guest blog post by Philippe Jounin, Senior Network Architect at Orange Business Services.
You could use track objects in Cisco IOS to track route reachability or metric, the status of an interface, or IP SLA compliance for a long time. Initially you could use them to implement reliable static routing (or even shut down a BGP session) or trigger EEM scripts. With a bit more work (and a few more EEM scripts) you could use object tracking to create time-dependent static routes.
Cisco IOS 15 has introduced Enhanced Object Tracking that allows first-hop router protocols like VRRP or HSRP to use tracking state to modify their behavior.