Category: Tags
AI
Artificial Intelligence (AI) and Machine Learning (ML) are the next big hype in networking, following Software-Defined Everything and Intent-Based Everything. As with the previous hype bubbles, it’s worth figuring out:
- How much of the hype is real (TL&DR: not much)?
- Whether the technology is ready to be used in production networks (TL&DR: some of it)
- How you could use the technology to make your life easier
How Real Is It?
As with the previous hype tsunamis, I’ll do my best to help you figure out the answers to the above questions with a hefty dose of skepticism and snark [1], starting with:
I also decided to “kick the tires” and document my (often less-than-stellar) experience with the most-overhyped products:
- Real-Life Not-Exactly-Networking AI Use Case
- ChatGPT on BGP Routing Security
- Kicking the Tires of GitHub Copilot
- Building a Small Network with ChatGPT
- ChatGPT Explaining the Need for iSCSI CRC
- Source IP Address in Multicast Packets
AI/ML in Networking: The Good, the Bad and the Ugly
Javier Antich created a wonderful AI/ML in Networking webinar in 2021. If you know nothing about AI/ML and wonder whether you should care about it, you MUST watch these videos from his webinar:
- Introduction to AI/ML Hype
- Machine Learning 101
- Machine Learning Techniques
- Use Cases for AI/ML in Networking
- The Long Tail of AI/ML Problems
- Ugly Challenges of Using AI/ML in Networking
- Language Models in AI/ML Landscape
- Language Model Basics
In 2023, Javier published a book covering the same set of topics in far more detail. I highly recommend reading it if you want to know more.
What Others Are Saying
I keep collecting interesting articles about AI in general and (lately) ChatGPT. I found these interesting enough to mention them in worth-reading blog posts:
- MUST READ: ChatGPT Is Bullshit (2024)
- Machine Learning Explained (2020)
- AI Makes Animists of Us All (2022)
- The AI Illusion (2022)
- Collections: On ChatGPT (a Historian Perspective) (2023)
- Putting Large Language Models in Context (2023)
- The Dangers of Knowing Everything (2023)
- Building Trustworthy AI (2023)
- Cargo Cult AI (2023)
- Building Stuff with Large Language Models Is Hard (2023)
- Worth Reading: AI Does Not Help Programmers (2023)
- Eyes that glaze over. Eyes like saucers. Eyes that narrow. (2023)
- Networking for AI Workloads (2023)
- Looking Inside Large Language Models (2023)
- Where Are the Self-Driving Cars? (2023)
- AI Risks (2023)
- State-of-the-Art AI (2023)
- The AI Supply Paradox (2023)
- ChatGPT Does Not Summarize (2024)
- You Probably Don't Need AI (2024)
- GitHub Copilot Workspace Review (2024)
- AI Is Still a Delusion (2024)
- AI and Google’s Quarterly Results (2024)
These are not bad either:
- What Is ChatGPT Doing … and Why Does It Work?
- We Can’t Build a Hut to the Moon
- The Delusion at the Center of the A.I. Boom (aka AI Solutionism)
- ChatGPT and Chemistry
- Cal Newport on ChatGPT
- Ruby Development with ChatGPT
- ChatGPT Is Your New Intern
- Using ChatGPT as a Technical Writing Assistant
- Why OpenAI is the new AWS
- Overemployed Hustlers Exploit ChatGPT To Take On Even More Full-Time Jobs
Finally, a few real-life uses of large language models:
- An Exploration of Embeddings and Vector Databases
- How GPT and LLMs will affect documentation
- I Built an AWS Well-Architected Chatbot with ChatGPT
- Building Boba AI – how to build a custom user interface in front of a large language model.
- Using Langchain to interact with ChatGPT
Blog Posts I Forgot to Categorize
[1] Please don’t blame me for pointing out the ever-lasting validity of Sturgeon’s law. Contrary to what some people think, I’m not trying hard to pick up dismal examples of AI failures; I’m just good at looking in the wrong places. Also, I’m too old to be wearing rosy glasses and drinking Kool-Aid.
OSPF
ChatGPT explaining OSPF to a high-school kid
Configuration Tips
This blog started as a collection of (hopefully) helpful configuration tricks, and I documented numerous Cisco IOS configuration tips in the early 2000s.
- Network Statements in the OSPF Process Are No Longer Order-Dependent
- Enhanced OSPF Adjacency Logging
- Network Statements Are No Longer Needed in OSPF Configuration
- Be Smart When Using the OSPF Network Statement
- Increased Number of OSPF processes in MPLS VPN Environments
- OSPF Router-Id Does Not Change When the Interface IP Address Changes
- Subnet Masks in OSPF Network Statements
- OSPF in a VRF Requires a Box-Unique Router ID
- Reverse Lookup of OSPF Router IDs
- Display Interfaces Belonging to a Single OSPF Process
- OSPF Ignores Subnet Mask Mismatch on Point-to-Point Links
- IOS Fossils: OSPF-to-BGP Redistribution
- Limitations of VRF Routing Protocols on Cisco IOS
- IOS Fossils: Classful OSPF Redistribution
- “ip ospf mtu-ignore” Is a Dangerous Command
- OSPF and Connected Networks: To Redistribute or Not?
- Unnumbered OSPF Interfaces in Quagga (and Cumulus)
Implementation Details
Let’s start with the elephant in the room: OSPF areas – a simple concept that got way too convoluted when OSPF started accreting nerd knobs like NSSA areas:
- Primary/Backup Area Border Router Designs
- Do We Still Need OSPF Areas and Summarization?
- OSPF Areas and Summarization: Theory and Reality
- Nerd Knobs Save the Day: NSSA Saga Continues
- Running OSPF in a Single Non-Backbone Area
- OSPF Summarization and Split Areas
OSPF default routes are another confusing topic. You could have inter-area default routes (used in stub areas) or external default routes that could be conditional or unconditional.
- Inserting Default Route Into OSPF
- OSPF Default Route: Design Scenarios
- tested configuration
- OSPF Default Route Based on IP SLA
- Default Routing in NSSA Area
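The difference between conditional and unconditional external default routes mentioned above boils down to a single check. Here is a minimal sketch of that decision, assuming a made-up `Router` model; the class, its fields, and the function name are hypothetical illustrations and not any vendor's API. It only covers external default routes, not the inter-area defaults used in stub areas.

```python
from dataclasses import dataclass


@dataclass
class Router:
    """Hypothetical model of a router that might originate an external default route."""
    name: str
    has_default_in_rib: bool         # is there a non-OSPF 0.0.0.0/0 route in the routing table?
    originate_default: bool = False  # is default-route origination configured at all?
    always: bool = False             # True = unconditional origination


def advertises_external_default(router: Router) -> bool:
    """Return True if the router would advertise an external default route into OSPF."""
    if not router.originate_default:
        return False
    # Unconditional: advertise regardless of the routing table.
    # Conditional: advertise only while the router itself has a default route.
    return router.always or router.has_default_in_rib


if __name__ == "__main__":
    routers = [
        Router("conditional-no-upstream", has_default_in_rib=False, originate_default=True),
        Router("conditional-with-upstream", has_default_in_rib=True, originate_default=True),
        Router("unconditional", has_default_in_rib=False, originate_default=True, always=True),
    ]
    for r in routers:
        print(f"{r.name}: advertises default = {advertises_external_default(r)}")
```

The conditional variant withdraws the default route when the router loses its own upstream default, which is usually what you want with multiple exit points. The unconditional variant keeps attracting traffic even when the router cannot forward it anywhere.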
OSPF adjacencies are another fun troubleshooting topic:
- OSPF Neighbors Stuck in EXSTART
- Troubleshooting OSPF Adjacencies
- Challenge: Establish OSPF Adjacency on a LAN Interface
- OSPF LAN Adjacency Challenge: Final Results
- Challenge: Mixing Numbered and Unnumbered Interfaces
- Mixing Numbered and Unnumbered OSPF Interfaces: Solution
The inimitable forwarding address in type-5 LSAs will make your head explode when combined with NSSA areas.
- OSPF Forwarding Address: Yet Another Kludge
- OSPF Forwarding Address YAK: Take 2
- Why OSPF Needs Forwarding Address With NSSA Areas
- The Unintended Consequences of NSSA Kludges
- More Thoughts on OSPF Forwarding Address
Want even more OSPF details? I’ve documented way too many of them since I started blogging, including:
- Type-1 (Router) LSA in OSPF Topology Database
- OSPF Graceful Shutdown
- More Details on OSPF Route Filters
- Common Sense Prevails Over RFC 2328
- OSPFv3 Router ID: the Long Shadow of IPv4
- OSPF Breaks When Faced With Overlapping IP Addresses
- OSPF Router ID Selection Trivia
- SPF Events in OSPF and IS-IS
- OSPF Router ID Selection: the Gory Details
- OSPF Route Selection Rules
- Change in OSPF Designated Router Creates Extra Network LSAs
- Inter-Process OSPF Route Selection Rules
- Why Is OSPF not Using TCP?
- What Exactly Happens after a Link Failure?
- OSPF Inter-Process Route Selection
- LSA/LSP Flooding in OSPF and IS-IS
- OSPF External Routes (Type-5 LSA) Mysteries
Deploying OSPF
Creative networking engineers often forget an unpleasant truth: OSPF is a single security domain. You should never run it with less-trusted peers, be it your customers, data center servers, or virtual machines.
- Do Not EVER Run OSPF or IS-IS With Your Internet Customers
- Don’t Run OSPF with Your Customers
- Why Would I Use BGP and not OSPF between Servers and the Network?
OSPF by itself is complex enough, but the real fun starts when you combine it with other protocols (for example, BGP and LDP):
- MPLS LDP Autoconfiguration
- Use Slow IGP Startup in LDP-only MPLS Environments
- LDP-IGP Synchronization in MPLS Networks
- OSPF Meets EIGRP
- Fixing the FIB Bottleneck
- Synchronizing BGP and OSPF (or OSPF and LDP)
- Unexpected Interactions Between OSPF and BGP
- Why Do We Need BGP-LS?
Running OSPF in large hub-and-spoke networks (for example, large DMVPN networks) is another tough challenge:
- OSPF Flooding Filters in Hub-and-Spoke Environments
- RIP Rocks in Low-End Hub-and-Spoke Networks
- Can You Run OSPF over DMVPN?
- Sometimes You Need to Step Back and Change Your Design
- OSPF Configuration in Phase 1 DMVPN Network
- Configuring OSPF in a Phase 2 DMVPN network
- More OSPF-over-DMVPN Questions
- DMVPN as a Backup for MPLS/VPN
- OSPF-over-DMVPN Using Two Hub Routers
- Redundant DMVPN designs, Part 1 (The Basics)
- Combining DMVPN with Existing MPLS/VPN Network
While you could use OSPF to get unequal-cost multipathing, you might get tripped up by numerous caveats; no wonder there are few implementations of this concept.
- Why Is OSPF (Or IS-IS) Afraid of Unequal-Cost Load Balancing
- Unequal-Cost Multipath in Link State Protocols
- Single-Metric Unequal-Cost Multipathing Is Hard
Finally, you can run OSPF over unnumbered interfaces, be it point-to-point serial links or Ethernet segments:
- Configure OSPF on Unnumbered Interfaces
- Packet Forwarding and Routing over Unnumbered Interfaces
- Running OSPF over Unnumbered Ethernet Interfaces
- OSPF and ARP on Unnumbered IPv4 Interfaces
- OSPF ECMP with Unnumbered IPv4 Interfaces
Rants
Now and then, I couldn’t resist writing an OSPF-related rant:
- Routing Protocols: Perfect Example of RFC 1925 Rule 5
- Stop Googling and Start Testing
- Is OSPF Unpredictable or Just Unexpected?
- Link-State Routing Protocols Are Eventually Consistent
- Building a Small Network with ChatGPT
- Why Is OSPF (and BGP) More Complex than STP?
What Others Are Writing About OSPF
- What I've Learned About Scaling OSPF in Data Centers
- Redistributing Full BGP Feed into OSPF
- OSPF Watcher
Other OSPF Blog Posts
- MPLS Traffic Engineering myths
- WAN Routing in Data Centers with Layer-2 DCI
- Implementing Control-Plane Protocols with OpenFlow
- Routing Protocols on NSX Edge Services Router
- BGP or OSPF? Does Topology Visibility Matter?
- Generating OSPF, BGP and MPLS/VPN Configurations from Network Data Model
- Use VRFs for VXLAN-Enabled VLANs
- FRRouting RIB and FIB
- FRRouting Loopback Interfaces and OSPF Costs
- Must Read: OSPF Protocol Analysis (RFC 1245)
LISP
LISP is a networking technology that has been searching for a relevant problem for a decade and a half (the LISP IETF working group started in the spring of 2009). Initially, I was cautiously optimistic. However, as LISP pivoted from an IPv6-over-IPv4 solution to a multihoming solution, then to a VM mobility and IP endpoint mobility solution, until it finally landed in the Cisco Campus BU as the foundational technology of Software-Defined Access, I lost all hope.
LISP started as a DNS-like cache-based packet forwarding technology. Eventually, reality intervened, and the LISP believers rediscovered the flaws of cache-based forwarding. It looks like LISP pivoted to become a topology-driven PUB-SUB protocol. Assuming that’s correct, there’s little conceptual difference between LISP and EVPN. It’s just a question of defining a suitable set of policy mechanisms and developing an optimal implementation.
Discussing the benefits and drawbacks of LISP or EVPN thus makes as much sense as debating the number of angels dancing on the head of a pin, but that has never stopped people from doing one or the other.
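To make the cache-based versus topology-driven distinction a bit more tangible, here is a toy sketch of the two models. Everything in it (the `MappingSystem`, `CacheBasedEdge` and `PubSubEdge` classes and their methods) is a hypothetical illustration; it does not model LISP or EVPN message formats, cache timeouts, or any real implementation. It shows just one commonly cited flaw of cache-based forwarding: a cached mapping goes stale when an endpoint moves.

```python
class MappingSystem:
    """Hypothetical authoritative endpoint-to-locator database."""

    def __init__(self):
        self.db = {}
        self.subscribers = []

    def register(self, endpoint, locator):
        self.db[endpoint] = locator
        # Publish the change to every subscribed edge device.
        for edge in self.subscribers:
            edge.table[endpoint] = locator


class CacheBasedEdge:
    """Pulls a mapping on the first packet and keeps using the cached copy."""

    def __init__(self, mapping_system):
        self.mapping_system = mapping_system
        self.table = {}
        self.misses = 0

    def forward(self, endpoint):
        if endpoint not in self.table:
            self.misses += 1  # the first packet has to wait for a lookup
            self.table[endpoint] = self.mapping_system.db.get(endpoint)
        return self.table[endpoint]


class PubSubEdge:
    """Receives all mappings proactively and gets updates pushed to it."""

    def __init__(self, mapping_system):
        self.table = dict(mapping_system.db)
        mapping_system.subscribers.append(self)

    def forward(self, endpoint):
        return self.table.get(endpoint)


ms = MappingSystem()
ms.register("vm-1", "site-A")
cache_edge = CacheBasedEdge(ms)
pubsub_edge = PubSubEdge(ms)

before_move = (cache_edge.forward("vm-1"), pubsub_edge.forward("vm-1"))
ms.register("vm-1", "site-B")  # the endpoint moves to another site
after_move = (cache_edge.forward("vm-1"), pubsub_edge.forward("vm-1"))

print("before move:", before_move)
print("after move: ", after_move)
```

After the move, the cache-based edge still points to the old site until something invalidates its cache entry, while the subscribed edge already has the new locator. The price of the second approach is that every edge device has to hold (and keep receiving) state it might never use.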
Just in case you want to know more, you will find some details in the LISP-related blog posts I wrote since 2010:
- Introduction to LISP (2010)
- VXLAN, OTV and LISP (2011)
- We Just Might Need NAT66/NPT66 (and Not LISP) (2011)
- Networking Tech Field Day #3: First Impressions (2012)
- Mobile ARP in Enterprise Networks (2012)
- Hot and Cold VM Mobility (2013)
- VXLAN and OTV: The Saga Continues (2014)
- Why Is Cisco Pushing LISP in Enterprise Campus? (2017)
- When All You Have Are Stretched VLANs... (2020)
- Grasp the Fundamentals before Spreading Opinions (2020)
- Packet Forwarding 101: Header Lookups (2022)
- Cache-Based Packet Forwarding (2022)
- Repost: LISP Is a False Economy (2022)
- Should We Use LISP? (2022)
- So-Called Modern VPNs: Marketing and Reality (2022)
- Multihoming Cannot Be Solved within a Network (2022)
- LISP vs EVPN: Mobility in Campus Networks (2024)
- Repost: The Real LISP Mobility Use Case (2024)
- Repost: State of Lisp Implementations (2024) (2024)
high availability
ChatGPT explaining application high availability to a high school kid
Before going into the details, it’s worth figuring out what the application (or system) users need as opposed to what they think they need:
- Fifty Shades of High Availability (2020)
- Figure Out What the Customer Really Needs (2017)
- Are Business Needs Just Excuses for Vendor Shenanigans? (2020)
- Redundancy Does Not Result in Resiliency (2017)
- High Availability Planning: Identify the Weakest Link (2016)
- Meaningful Availability (2020)
- Differential Availability (2020)
Not surprisingly, IT vendors sell magic infrastructure solutions as the high-availability panacea based on the assumption that redundant infrastructure cannot fail. Nothing could be further from the truth:
- High Availability Fallacies (2011)
- If Something Can Fail, It Will (2012)
- How Hard Is It to Think about Failures? (2016)
- This Is What Makes Networking So Complex (2013)
- Decide How Badly You Want to Fail (2019)
- Sometimes You Have to Decide How You Want to Fail (2015)
- Some People Don’t Get It: It Will Eventually Fail (2016)
- The Network Is Reliable and Other Stories (2016)
- Circular Dependencies Considered Harmful (2021)
High Availability Concepts, Technologies, and Solutions
You can use a plethora of approaches depending on your availability targets:
- Disaster recovery is the right tool for the job if you’re OK with the system being down for a few hours.
- Automatic restart of application instances combined with disaster recovery is acceptable if you can accept your system being down ~0.1% of the time (99.9% availability).
- Availability targets higher than 99.9% can only be reached reliably with proper application design supported by well-designed infrastructure.
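To put numbers behind the availability targets above, it helps to convert a percentage into a downtime budget. The following back-of-the-envelope sketch does that; the function name and the choice of a 365-day year are assumptions made for illustration.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: ignore leap years


def downtime_budget_hours(availability_pct: float, period_hours: float = HOURS_PER_YEAR) -> float:
    """Return the downtime (in hours) allowed by an availability target over a period."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability must be a percentage between 0 and 100")
    return period_hours * (1 - availability_pct / 100)


if __name__ == "__main__":
    for target in (99.0, 99.9, 99.99, 99.999):
        hours = downtime_budget_hours(target)
        print(f"{target}% availability allows {hours:.2f} hours ({hours * 60:.1f} minutes) of downtime per year")
```

A 99.9% target leaves you with less than nine hours of downtime per year, and every additional nine cuts that budget by a factor of ten. That makes it easier to see why a few manual recovery procedures per year are enough to blow through the higher targets.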
I wrote over 130 blog posts on these topics. It would be impossible to list all of them on a single page; major high-availability technologies or concepts thus have dedicated pages:
- Disaster recovery and avoidance
- High availability clusters
- Public and private cloud deployments
- Global and local load balancing with IP anycast
One of the prerequisites for highly available services is also redundant networking infrastructure:
- Redundant Data Center Internet Connectivity – Problem Overview (2013)
- Redundant Data Center Internet Connectivity – High-Level Design (2013)
- Coping with Byzantine Routing Failures (2014)
- Site and Host Multihoming (2023)
- High Availability Switching (2024)
Regardless of your approach, the only sustainable way to get highly available services is the correct design of the application stack. For more details, watch the Designing Active-Active and Disaster Recovery Data Centers webinar; I also wrote a few blog posts on the topic:
- Swimlanes, Read-Write Transactions and Session State (2017)
- Solving the Problem in the Right Place (2017)
- Moving Complexity to Application Layer? (2017)
- Optimizing the Time-to-First-Byte (2021)
Notable Outages
Finally, here are a few notable outages. TL&DR: it can happen to the big guys and will eventually happen to you.
Other High Availability Blog Posts
- 2015
- 2014
- 2013
- 2012