Distributed Systems
Regardless of what anyone is telling you, reliable distributed systems aren’t simple. I strongly recommend you take the time to read at least some of these documents first:
- Fallacies of Distributed Computing
- Fallacies of Distributed Computing Explained
- Notes on Distributed Systems for Young Bloods
- CAP Theorem
- Byzantine Fault Tolerance
- Scaling Distributed Systems is Hard
- Consensus is Harder Than It Looks
- Modules, Monoliths, and Microservices
Fallacies of Distributed Computing
We covered the fallacies of distributed computing in the How Networks Really Work webinar:
- Fallacies of Distributed Computing
- Network Is (Not) Reliable
- Latency Is (Not) Zero
- Bandwidth Is (Not) Infinite and Free
- Networks Are (Not) Secure
- Internet Has More than One Administrator
- Networks Are (Not) Homogenous
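To make the first two fallacies concrete, here is a minimal Python sketch of a caller that assumes the network can fail and that waiting costs time: it retries a failed operation with exponential backoff and jitter. All names in it (`TransientNetworkError`, `call_with_retries`, `make_flaky_operation`) are hypothetical and exist only for illustration; the "remote call" is simulated, and nothing here comes from the webinar itself.

```python
import random
import time


class TransientNetworkError(Exception):
    """Stands in for a dropped packet or a timed-out request (illustrative only)."""


def call_with_retries(operation, attempts=4, base_delay=0.01, sleep=time.sleep):
    """Call operation(); on TransientNetworkError retry with exponential backoff and jitter."""
    for attempt in range(attempts):
        try:
            return operation()
        except TransientNetworkError:
            if attempt == attempts - 1:
                raise  # out of retries: surface the failure to the caller
            # Double the delay on each attempt; the random jitter keeps many
            # clients from retrying in lockstep after a shared outage.
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay))


def make_flaky_operation(failures_before_success):
    """Return a simulated remote call that fails a fixed number of times, then succeeds."""
    state = {"calls": 0}

    def operation():
        state["calls"] += 1
        if state["calls"] <= failures_before_success:
            raise TransientNetworkError("simulated packet loss")
        return "ok after %d calls" % state["calls"]

    return operation


if __name__ == "__main__":
    print(call_with_retries(make_flaky_operation(2)))
```

Retrying like this is only safe when the operation is idempotent; if the request was executed and only the reply was lost, a blind retry executes it a second time.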
Distributed Systems in Software-Defined Networking
Wonder how these concepts apply to Software-Defined Networking? Any network is a distributed system, and when you add an SDN controller, it becomes a tightly coupled distributed system. I explained the implications in a few blog posts:
- State Consistency in Distributed SDN Controller Clusters (2021)
- Going Back to the Mainframes? (2015)
- Is Controller-Based Networking More Reliable than Traditional Networking? (2015)
- Controller Cluster Is a Single Failure Domain (2014)
- On SDN Controllers, Interconnectedness and Failure Domains (2015)
- Impact of Controller Failures in Software-Defined Networks (2019)
- Impact of Centralized Control Plane Partitioning (2021)
- More on Centralized Control and SDN (2015)
- We Need Consistency more than Controllers (2014)
- Fifty Shades of High Availability (2020)
Distributed Systems in Network Devices
You might also encounter distributed systems in high-end network devices. I described a few implementation gotchas in these blog posts:
- Non-Stop Forwarding (NSF) (2021)
- Stateful Switchover (SSO) (2021)
- Graceful Restart (GR) (2021)
- Big Picture: BFD, Non-Stop Forwarding, and Graceful Restart (2021)
- Non-Stop Routing (NSR) (2021)
- BGP Graceful Restart Considered Harmful (2024)
OpenFlow Controllers
Here are some older blog posts focusing on (now mostly obsolete) OpenFlow and OpenFlow-based SDN controllers. I’m including them here because I firmly believe we SHOULD learn from past mistakes:
- OpenFlow Fabric Controllers Are Light-years Away from Wireless Ones (2013)
- Centralized Control Is Not Centralized Control Plane (2015)
- Scalability of OpenFlow Control Plane Network (2016)
- As Expected: Where Have All the SDN Controllers Gone? (2019)
More Details
Need even more details? You’ll find them in these webinars (available with a Standard ipSpace.net Subscription):