« ipSpace.net blog

Wednesday, November 11, 2020 08:10 UTC

Appreciating the Networking Fundamentals

When I started creating the How Networks Really Work series, I wondered whether our subscribers (mostly seasoned networking engineers) would find it useful. Turns out at least some of them do; this is what a long-time subscriber sent me:

How Networks Really Work is great, it’s like looking from a plane and seeing how all the roads are connected to each other. I know networking just enough to design and manage a corporate network, but there are many things I have learned, used and forgotten along the way.

So, getting a broad vision helps me remember why I chose something and maybe solve my bad choices. There are many things that I may never use, but with the movement of all things in the cloud it’s great to know, or at least understand, how things really work.

add comment

networking fundamentals

Monday, November 30, 2020 07:47 UTC

Worth Exploring: Pluginized Protocols

Remember my BGP route selection rules are a clear failure of intent-based networking paradigm blog post? I wrote it almost three years ago, so maybe you want to start by rereading it…

Making long story short: every large network is a unique snowflake, and every sufficiently convoluted network architect has unique ideas of how BGP route selection should work, resulting in all sorts of crazy extended BGP communities, dozens if not hundreds of nerd knobs, and 2000+ pages of BGP documentation for a recent network operating system (no, unfortunately I’m not joking).

read more add comment

Saturday, November 28, 2020 09:12 UTC

Fun Times: Another Broken Linux ALG

Dealing with protocols that embed network-layer addresses into application-layer messages (like FTP or SIP) is great fun, more so if the said protocol traverses a NAT device that has to find the IP addresses embedded in application messages while translating the addresses in IP headers. For whatever reason, the content rewriting functionality is called application-level gateway (ALG).

Even when we’re faced with a monstrosity like FTP or SIP that should have been killed with napalm a microsecond after it was created, there’s a proper way of doing things and a fast way of doing things. You could implement a protocol-level proxy that would intercept control-plane sessions… or you could implement a hack that tries to snoop TCP payload without tracking TCP session state.

Not surprisingly, the fast way of doing things usually results in a wonderful attack surface, more so if the attacker is smart enough to construct HTTP requests that look like SIP messages. Enjoy ;)

add comment

security

Friday, November 27, 2020 07:42 UTC

Reviving Old Content, Part 1

More than a decade ago I published tons of materials on a web site that eventually disappeared into digital nirvana, leaving heaps of broken links on my blog. I decided to clean up those links, and managed to save some of the vanished content from the Internet Archive:

I also updated dozens of blog posts while pretending to be Indiana Jones, including:

read more add comment

BGP
MPLS

Thursday, November 26, 2020 06:11 UTC

Over 300 Hours of Subscription Content on ipSpace.net

It’s amazing how far you can get if you keep doing something for a long-enough time. In a bit over 10 years (the initial versions of the earliest still-active webinars were created in October 2010), we accumulated over 300 hours of online content available with ipSpace.net subscription, plus another 130 hours of online course content.

Obviously I couldn’t have done that myself. Thanks a million to Irena who took over most of the day-to-day business a few years ago, dozens of authors, and thousands of subscribers who enabled us to make it all happen.

see 2 comments

Wednesday, November 25, 2020 07:09 UTC

Growing Beyond Networking Skills

One of my subscribers trying to figure out how to improve his career choices sent me this question:

I am Sr. Network Engineer with 12+ Years’ experience. I was quit happy with my networking skills but will all the recent changes I’m confused. I am not able to understand what are the key skills I should learn as a network engineer to keep myself demandable.

Before reading the rest of this blog post, please read Cloud and the Three IT Geographies by Massimo Re Ferre.

Fast Failover: Hardware and Software Implementations

In previous blog posts in this series we discussed whether it makes sense to invest into fast failover network designs, the topologies you can use in such designs, and the fault detection techniques. I also hinted at different fast failover implementations; this blog post focuses on some of them.

Hardware-based failover changes the hardware forwarding tables after a hardware-detectable link failure, most likely loss-of-light or transceiver-reported link fault. Forwarding hardware cannot do extensive calculations; the alternate paths are thus usually pre-programmed (more details below).

Why Is Public Cloud Networking So Different?

A while ago (eons before AWS introduced Gateway Load Balancer) I discussed the intricacies of AWS and Azure networking with a very smart engineer working for a security appliance vendor, and he said something along the lines of “it shows these things were designed by software developers – they have no idea how networks should work.”

In reality, at least some aspects of public cloud networking come closer to the original ideas of how IP and data-link layers should fit together than today’s flat earth theories, so he probably wanted to say “they make it so hard for me to insert my virtual appliance into their network.”

Worth Reading: Do Your Homework

Tom Hollingsworth wrote another must-read blog post in which he explained what one should do before asking for help:

If someone comes to me and says, “I tried this and it failed and I got this message. I looked it up and the response didn’t make sense. Can you tell me why that is?” I rejoice. That person has done the legwork and narrowed the question down to the key piece they need to know.

In other words (again his), do your homework first and then ask relevant questions.

add comment

training

Friday, November 20, 2020 07:59 UTC

Video: Know Your Users' Needs

After explaining why you should focus on defining the problem before searching for a magic technology that will solve it, I continued the Focus on Business Challenges First presentation with another set of seemingly simple questions:

Who are your users/customers?
What do they really need?
Assuming you’re a service provider, what are you able to sell to your customers… and how are you different from your competitors?

Watch the video

The video is part of Business Aspects of Networking Technologies webinar and available with Free ipSpace.net Subscription.

add comment

Thursday, November 19, 2020 07:28 UTC

Fast Failover: Topologies

In the blog post introducing fast failover challenge I mentioned several typical topologies used in fast failover designs. It’s time to explore them.

The Basics

Fast failover is (by definition) adjustment to a change in network topology that happens before a routing protocol wakes up and deals with the change. It can therefore use only locally available information, and cannot involve changes in upstream devices. The node adjacent to the failed link has to deal with the failure on its own without involving anyone else.

read more add comment

IP routing

Wednesday, November 18, 2020 07:53 UTC

Why Is OSPF not Using TCP?

A Network Artist sent me a long list of OSPF-related questions after watching the Routing Protocols section of our How Networks Really Work webinar. Starting with an easy one:

From historical perspective, any idea why OSPF guys invented their own transport protocol instead of just relying upon TCP?

I wasn’t there when OSPF was designed, but I have a few possible explanations. Let’s start with the what functionality should the transport protocol provide reasons:

How Fast Can We Detect a Network Failure?

In the introductory fast failover blog post I mentioned the challenge of fast link- and node failure detection, and how it makes little sense to waste your efforts on fast failover tricks if the routing protocol convergence time has the same order of magnitude as failure detection time.

Now let’s focus on realistic failure detection mechanisms and detection times. Imagine a system connecting a hardware switching platform (example: data center switch or a high-end router) with a software switching platform (midrange router):

Self-promotion Disguised as Research Paper

From AI is wrestling with a replication crisis (HT: Drew Conry-Murray)

Last month Nature published a damning response written by 31 scientists to a study from Google Health that had appeared in the journal earlier this year. Google was describing successful trials of an AI that looked for signs of breast cancer in medical images. But according to its critics, the Google team provided so little information about its code and how it was tested that the study amounted to nothing more than a promotion of proprietary tech (emphasis mine).

No surprise there, we’ve seen it before (not to mention the “look how awesome we are, but we can’t tell you the details” Jupiter Rising article).

add comment

worth reading

Friday, November 13, 2020 07:25 UTC

Video: Getting a Packet Across a Network

After (hopefully) agreeing on what routing, bridging, and switching are, let’s focus on the first important topic in this area: how do we get a packet across the network? Yet again, there are three fundamentally different technologies:

Source node knows the full path (source routing)
Source node opens a path (virtual circuit) to the destination node and uses that path to send traffic
The network performs hop-by-hop destination-address-based packet forwarding.

More details in the Getting Packets Across the Network video.

Watch the video

The video is part of How Networks Really Work webinar and available with Free ipSpace.net Subscription.

add comment

Thursday, November 12, 2020 07:15 UTC

Worth Reading: Protocol Options Rusted Shut

A long while ago I found a great article explaining TLS 1.3 and its migration woes on CloudFlare blog. While I would strongly recommend you read it just to get familiar with TLS 1.3, the real fun starts when the author discusses migration problems, kludges you have to use trying to fix them, less-than-compliant implementations breaking those kludges, and options that were supposed to be dynamic, but turn out to be static (rusted shut) due to middleboxes that implemented protocols as-seen-in-the-wild not as-described-in-RFCs.

Change a few TLAs and you could be reading about TCP, IP stack, IPv6, BGP… I addressed those aspects in the ossification and centralization part of Upcoming Internet Challenges webinar.

add comment

Internet

Tuesday, November 10, 2020 07:14 UTC

Fast Failover: The Challenge

Sometimes you’re asked to design a network that will reroute around a failure in milliseconds. Is that feasible? Maybe. Is it simple? Absolutely not.

In this series of blog posts we’ll start with the basics, explore the technologies that you can use to reach that goal, and discover one or two unexpected rabbit holes.

Fast failover is just one of the topics we’ll discuss in Advanced Routing Protocol Features part of How Networks Really Work webinar.

New Content: VMware NSX-T 3.0 Update

When VMware NSX-T 3.0 came out, I planned to do an update session of the VMware NSX Technical Deep Dive webinar along the lines of what I did for AWS Networking a few weeks ago. However, it turned out that most of the new features didn’t take more than a bullet or two on an existing slide, or at most a new slide.

Covering them in a live session and then slicing-and-dicing the resulting recording simply didn’t make sense, so I updated the videos in summer 2020 (the last batch was published in early August).

read more add comment

NSX

Saturday, November 7, 2020 07:07 UTC

Worth Reading: The Trap of The Premature Senior

Here’s another riff on the “when you’re the smartest person in the room, change the room” theme: The Trap of The Premature Senior by inimitable Charity Majors. Enjoy!

add comment

worth reading

Friday, November 6, 2020 07:37 UTC

Video: NetQ and Cumulus Linux Data Models

In the last part of his Cumulus Linux 4.0 Update Pete Lumbis talked about using NetQ to capture streaming telemetry and increase network observability, and the new model-driven configuration approach (including all the usual buzzwords like NETCONF, RPC, YAML, JSON, and OpenConfig) coming in 2020.

Watch the video

You need Free ipSpace.net Subscription to watch the video.

add comment

Thursday, November 5, 2020 07:47 UTC

Renumbering Public Cloud Address Space

Got this question from one of the networking engineers “blessed” with rampant clueless-rush-to-the-cloud.

I plan to peer multiple VNet from different regions. The problem is that there is not any consistent deployment in regards to the private IP subnets used on each VNet to the point I found several of them using public IP blocks as private IP ranges. As far as I recall, in Azure we can’t re-ip the VNets as the resource will be deleted so I don’t see any other option than use NAT from offending VNet subnets to use my internal RFC1918 IPv4 range. Do you have a better idea?

The way I understand Azure, while you COULD have any address range configured as VNet CIDR block, you MUST have non-overlapping address ranges for VNet peering.

read more add comment

cloud
Azure

Wednesday, November 4, 2020 06:02 UTC

Do We Need LFA or FRR for Fast Failover in ECMP Designs?

One of my readers sent me a question along these lines:

Imagine you have a router with four equal-cost paths to prefix X, two toward upstream-1 and two toward upstream-2. Now let’s suppose that one of those links goes down and you want to have link protection. Do I really need Loop-Free Alternate (LFA) or MPLS Fast Reroute (FRR) to get fast (= immediate) failover or could I rely on multiple equal-cost paths to get the job done? I’m getting different answers from different vendors…

Please note that we’re talking about a very specific question of whether in scenarios with equal-cost layer-3 paths the hardware forwarding data structures get adjusted automatically on link failure (without CPU reprogramming them), and whether LFA needs to be configured to make the adjustment happen.

MUST READ: How to troubleshoot routing protocols session flaps

Did you ever experience an out-of-the-blue BGP session flap after you were running that peering for months? As Dmytro Shypovalov explains in his latest blog post, it’s always MTU (just kidding, of course it’s always DNS, but MTU blackholes nonetheless result in some crazy behavior).

Keep reading

add comment

Monday, November 2, 2020 17:28 UTC

New Ansible for Networking Engineers Content

When restructuring our online courses we decided to make the video content that was previously part of Ansible online course available with Standard ipSpace.net Subscription.

If you haven’t enrolled into our automation online course (which always included the extra bits) you’ll find the following additional content in our Ansible for Networking Engineers webinar:

read more add comment

Blog Posts in November 2020

The Basics