Bill Dagy sent me an annoying ISR gotcha. In his own words:
Since you have a large audience I thought I would throw this out here. Maybe it will help someone avoid spending 80 man hours troubleshooting network slowdowns.
Here’s the root cause of that behavior:
Cisco is now shipping routers that have some specified maximum throughput, but you have to buy a “boost license” to run them unthrottled. Maybe everyone already knew this but it sure took us by surprise.
Don’t believe it? Here’s a snapshot from Cisco 4000 Family Integrated Services Router Data Sheet:
A while ago someone pointed me to an interesting talk explaining why 99th percentile represents a pretty good approximation of user-experienced latency on a typical web page (way longer version: Understanding Latency and Application Responsiveness, also How I Learned to Stop Worrying and Love Misery)
If you prefer reading instead of watching videos, there’s also everything you know about latency is wrong.
Maybe it’s time to build our own network monitoring systems from open-source components instead of paying vendors big bucks for slick PowerPoint slides.
In January 2020 Doug Heckaman documented his experience with VeloCloud SD-WAN. He tried to be positive, but for whatever reason this particular bit caught my interest:
Edge Gateways have a limited number of tunnels they can support […]
WTF? Wasn’t x86-based software packet forwarding supposed to bring infinite resources and nirvana? How badly written must your solution be to have a limited number of IPsec tunnels on a decent x86 CPU?
In the second half of his Networks, Buffers and Drops webinar JR Rivers focused on end systems: what tools could you use to measure end-to-end TCP throughput, or monitor performance of an individual socket or the whole TCP stack?
You’ll need at least free ipSpace.net subscription to watch the video.
Here’s the question I got from one of my readers:
Do you have any data available to show the benefits of jumbo frames in 40GE/100GE networks?
In case you’re wondering why he went down this path, here’s the underlying problem:
We did several podcasts describing how one could get stellar packet forwarding performance on x86 servers reimplementing the whole forwarding stack outside of kernel (Snabb Switch) or bypassing the Linux kernel and moving the packet processing into userspace (PF_Ring).
Now let’s see if it’s possible to improve the Linux kernel forwarding performance. Thomas Graf, one of the authors of Cilium claims it can be done and explained the intricate details in Episode 64 of Software Gone Wild.
In Cisco IOS, when a packet is marked by IOS for ICMP redirect to a better gateway, that packet is being punted to the CPU, right?
It depends on the platform, but it’s going to hurt no matter what.
Marek Majkowski published an awesome real-life story on CloudFlare blog: users experienced occasional short-term sluggish performance and while everything pointed to a network problem, it turned out to be a garbage collection problem in Linux kernel.
Takeaway: It might not be the network's fault.
Also: How many people would be able to troubleshoot that problem and fix it? Technology is becoming way too complex, and I don’t think software-defined-whatever is the answer.
When writing the Packet- and Flow-Based Forwarding blog post, I tried to find a good definition of flow-based forwarding (and I was not the only one being confused), and the one from Junos SRX documentation is as good as anything else I found, so let’s use it.
TL&DR: Flow-based forwarding is a valid technical concept. However, when mentioned together with OpenFlow, it’s mostly marketing fluff.
One of the participants of the Carrier Ethernet LinkedIn group asked a great question:
When we install a virtual-router of any vendor over an ordinary sever (having general-purpose microprocessor), can it really compete with a physical-router having ASICs, Network Processors…?
Short answer: No … and here’s my longer answer (cross-posted to my blog because not all of my readers participate in that group).