Category: design
Tuning BGP Convergence in High-Availability Firewall Cluster Design
Two weeks ago Nicola Modena explained how to design BGP routing to implement resilient high-availability network services architecture. The next step to tackle was obvious: how do you fine-tune convergence times, and how does BGP convergence compare to the more traditional FHRP-based design.
Worth Reading: Resilience Engineering
Starting with my faking disaster recovery tests blog post Terry Slattery wrote a great article delving into the intricacies of DR testing, types of expected disasters, and resilience engineering. As always, a fantastic read from Terry.
Using BGP for Firewall High Availability: Design and Software Upgrades
Remember the “BGP as High Availability Protocol” article Nicola Modena wrote a few months ago? He finally found time to extend it with BGP design considerations and a description of a seamless-and-safe firewall software upgrade procedure.
Stretched VLANs and Failing Firewall Clusters
After publishing the Disaster Recovery Faking, Take Two blog post (you might want to read that one before proceeding) I was severely reprimanded by several people with ties to virtualization vendors for blaming virtualization consultants when it was obvious the firewall clusters stretched across two data centers caused the total data center meltdown.
Let’s chase that elephant out of the room first. When you drive too fast on an icy road and crash into a tree who do you blame?
- The person who told you it’s perfectly OK to do so;
- The tire manufacturer who advertised how safe their tires were?
- The tires for failing to ignore the laws of physics;
- Yourself for listening to bad advice
For whatever reason some people love to blame the tires ;)
Worth Reading: How Complex Systems Fail
I don’t remember who pointed me to the excellent How Complex Systems Fail document. It’s almost like RFC1925 – I could quote it all day long, and anyone dealing with large mission-critical distributed systems (hint: networks) should read it once a day ;))
Enjoy!
Disaster Recovery Faking, Take Two
An anonymous (for reasons that will be obvious pretty soon) commenter left a gem on my Disaster Recovery Test Faking blog post that is way too valuable to be left hidden and unannotated.
Here’s what he did:
Once I was tasked to do a DR test before handing over the solution to the customer. To simulate the loss of a data center I suggested to physically shutdown all core switches in the active data center.
That’s the right first step: simulate the simplest scenario (total DC failure) in a way that is easy to recover (power on the switches). It worked…
Must read: Shades of Lock-in
Gregor Hohpe published an excellent series on Martin Fowler’s web site focusing on various aspects of lock-in. If nothing else, you SHOULD read the shades of lock-in part, and combine it with my thoughts on lock-in in data center networking.
Worth Reading: Anycast DNS in Enterprise Networks
Anycast (advertising the same IP address from multiple servers/locations) has long been used to implement scale-out public DNS services (the whole root DNS system runs on massive anycast), but it’s not as common in enterprise networks.
The blog posts written by Tom Bowles should get you there. He started with the idea and described his implementation using Infoblox DNS.
Want to know even more? I covered numerous load balancing mechanisms including anycast in Data Centers Infrastructure for Networking Engineers webinar.
Redundant BGP Connectivity on a Single ISP Connection
A while ago Johannes Weber tweeted about an interesting challenge:
We want to advertise our AS and PI space over a single ISP connection. How would a setup look like with 2 Cisco routers, using them for hardware redundancy? Is this possible with only 1 neighboring to the ISP?
Hmm, so you have one cable and two router ports that you want to connect to that cable. There’s something wrong with this picture ;)
Disaster Recovery Test Faking: Another Use Case for Stretched VLANs
The March 2019 Packet Pushers Virtual Design Clinic had to deal with an interesting question:
Our server team is nervous about full-scale DR testing. So they have asked us to stretch L2 between sites. Is this a good idea?
The design clinic participants were a bit more diplomatic (watch the video) than my TL&DR answer which would be: **** NO!
Let’s step back and try to understand what’s really going on:
Know Thy Environment Before Redesigning It
A while ago I had an interesting consulting engagement: a multinational organization wanted to migrate off global Carrier Ethernet VPN (with routers at the edges) to MPLS/VPN.
While that sounds like the right thing to do (after all, L3 must be better than L2, right?) in that particular case they wanted to combine the provider VPN with Internet-based IPsec VPN… and doing that in parallel with MPLS/VPN tends to become an interesting exercise in “how convoluted can I make my design before I give up and migrate to BGP”.
Don't Base Your Design on Vendor Marketing
Remember how Arista promoted VXLAN coupled with deep buffer switches as the perfect DCI solution a few years ago? Someone took Arista’s marketing too literally, ran with the idea and combined VXLAN-based DCI with traditional MLAG+STP data center fabric.
While I love that they wrote a blog post documenting their experience (if only more people would do that), it doesn’t change the fact that the design contains the worst of both worlds.
Here are just a few things that went wrong:
Bathwater and Hyperscalers
Russ White recently wrote an interesting blog post claiming how we should not ignore any particular technology just because it was invented by a hyperscaler illustrating his point with a half-dozen technologies that were first used by NASA.
However, there are “a few” details he glossed over:
Real-Life Data Center Meltdown
A good friend of mine who prefers to stay A. Nonymous for obvious reasons sent me his “how I lost my data center to a broadcast storm” story. Enjoy!
Small-ish data center with several hundred racks. Row of racks supported by an end-of-row stack. Each stack with 2 x L2 EtherChannels, one EC to each of 2 core switches. The inter-switch link details don’t matter other than to highlight “sprawling L2 domains."
VLAN pruning was used to limit L2 scope, but a few VLANs went everywhere, including the management VLAN.
Decide How Badly You Want to Fail
Every time I’m running a data center-related workshop I inevitably get pulled into stretched VLAN and stretched clusters discussion. While I always tell the attendees what the right way of doing this is, and explain the challenges of stretched VLANs from all perspectives (application, database, storage, routing, and broadcast domains) the sad truth is that sometimes there’s nothing you can do.
In those sad cases, I can give the workshop attendees only one advice: face the reality, and figure out how badly you might fail. It’s useless pretending that you won’t get into a split-brain scenario - redundant equipment just makes it less likely unless you over-complicated it in which case adding redundancy reduces availability. It’s also useless pretending you won’t be facing a forwarding loop.