Category: networking fundamentals
After figuring out what business problem you’re trying to solve and what the users expect to get from you it’s time for the next crucial question: should you buy a shrink-wrapped product/solution or build your own? I addressed that question in the third part of Focus on Business Challenges First presentation.
Not surprisingly, the same dilemma applies to network automation solutions, and is often the source of endless time-wasting discussions that I really should have stopped engaging in, but sometimes duty calls ;)
One of my readers encountered an interesting problem when upgrading a data center fabric to 100 Gbps leaf-to-spine links:
- They installed new fiber cables and SFPs;
- Everything looked great… until someone started complaining about application performance problems.
- Nothing else has changed, so the culprit must have been the network upgrade.
- A closer look at monitoring data revealed CRC errors on every leaf switch. Obviously something was badly wrong with the whole batch of SFPs.
Fortunately my reader took a closer look at the data before they requested a wholesale replacement… and spotted an interesting pattern:
After explaining why you should focus on defining the problem before searching for a magic technology that will solve it, I continued the Focus on Business Challenges First presentation with another set of seemingly simple questions:
- Who are your users/customers?
- What do they really need?
- Assuming you’re a service provider, what are you able to sell to your customers… and how are you different from your competitors?
From historical perspective, any idea why OSPF guys invented their own transport protocol instead of just relying upon TCP?
I wasn’t there when OSPF was designed, but I have a few possible explanations. Let’s start with the what functionality should the transport protocol provide reasons:
In the introductory fast failover blog post I mentioned the challenge of fast link- and node failure detection, and how it makes little sense to waste your efforts on fast failover tricks if the routing protocol convergence time has the same order of magnitude as failure detection time.
Now let’s focus on realistic failure detection mechanisms and detection times. Imagine a system connecting a hardware switching platform (example: data center switch or a high-end router) with a software switching platform (midrange router):
After an easy start defining flows and walking us through various maximum flow algorithms, Rachel explained circulations and saturating flows, switched into high gear with (supposedly painless) intro to linear programming and minimum cost flow problems, and concluded with dynamic flows and using flows to explore graph connectivity.
After (hopefully) agreeing on what routing, bridging, and switching are, let’s focus on the first important topic in this area: how do we get a packet across the network? Yet again, there are three fundamentally different technologies:
- Source node knows the full path (source routing)
- Source node opened a path (virtual circuit) to the destination node and uses that path to send traffic
- The network performs hop-by-hop destination-address-based packet forwarding.
More details in the Getting Packets Across the Network video.
When I started creating the How Networks Really Work series I wondered whether our subscribers (mostly seasoned networking engineers) would find it useful. Turns out at least some of them do; this is what a long-time subscriber sent me:
How Networks Really Work is great, it’s like looking from a plane and seeing how all the roads are connected to each other. I know networking just enough to design and manage a corporate network, but there are many things I have learned, used and forgotten along the way.
So, getting a broad vision helps me remember why I chose something and maybe solve my bad choices. There are many things that I may never use, but with the movement of all things in the cloud it’s great to know, or at least understand, how things really work.
I should have known better, but I got pulled into another stretched VLANs for disaster recovery tweetfest. Surprisingly, most of the tweets were along the lines of you really shouldn’t be doing that and that would never work well, but then I guess I was only exposed to a small curated bubble of common sense… until this gem appeared in my timeline:
Interestingly, that’s exactly how IP works:
In her lecture you’ll find:
- maximum branching algorithms (and I couldn’t stop wondering why we don’t use them for OSPF- or IS-IS flooding)
- path algorithms including the ones used in OSPF, IS-IS, or BGP, as well as algorithms that find K shortest paths
- center problems (for example: where do I put my streaming server or my BGP route reflector)
If you’re working solely with IP-based networks, you’re probably quick to assume that hop-by-hop destination-only forwarding is the only packet forwarding paradigm that makes sense. Not true, even today’s networks use a variety of forwarding mechanisms, most of them called some variant of routing or switching.
What exactly is the difference between the two, and what is bridging? I’m answering these questions (and a few others like what’s the difference between data-, control- and management planes) in the Bridging, Routing and Switching Terminology video.
In December 2019 I finally turned my focus on business challenges first presentation into a short webinar session (part of Business Aspects of Networking Technologies webinar) starting with defining the problem before searching for a solution including three simple questions:
- What BUSINESS problem are you trying to solve?
- Are there good-enough alternatives or should you really invest into new technology and/or equipment?
- Is the problem worth solving?
The last Fallacy of Distributed Computing I addressed in the introductory part of How Networks Really Work webinar was The Network Is Homogenous. No, it’s not and it never was… for more details watch this video.
One of the most annoying part in every training content development project was the ubiquitous question somewhere at the end of the process: “and now we’d need a few review questions”. I’m positive anyone ever involved in a similar project can feel the pain that question causes…
Writing good review questions requires a particularly devious state of mind, sometimes combined with “I would really like to get the answer to this one” (obviously you’d mark such questions as “needs further research”, and if you’re Donald Knuth the question would be “prove that P != NP").
Avery Pennarun continued his if only IPv6 would be less academic saga with a must-read IPv4, IPv6, and a sudden change in attitude article in which he (among other things) correctly identified IPv6 as a typical example of second-system effect:
If we were feeling snarky, we could perhaps describe IPv6 as “the String Theory of networking”: a decades-long boondoggle that attracts True Believers, gets you flamed intensely if you question the doctrine, and which is notable mainly for how much progress it has held back.
In the end, his conclusion matches what I said a decade ago: if only the designers of the original Internet wouldn’t be too stubborn to admit a networking stack needs a session layer. For more details, watch The Importance of Network Layers part of Networks Really Work webinar