We have a network with two data centers (connected with a DCI link). How could we ensure the applications in a data center stay reachable even if all local Internet links fail?
On the face of it, the problem seems trivial. After all, you already have the DCI link in place, so what’s the big deal? However, we quickly figured out the problem is trickier than it seems.
In the following short video I explain what the problem is and what a potential solution might look like. You'll find more details here.
Related blog posts
Todd Hoff wrote a great in-depth commentary on this video that you absolutely have to read.
And here are a few other relevant blog posts:
- High-level design document in ipSpace solutions corner
- Distributed firewalls – how badly do you want to fail?
- Long-distance vMotion for disaster avoidance? Do the math first!
- Is layer-3 DCI safe?
- Stretched layer-2 subnets – the server engineer perspective
- Layer-2 DCI and the infinite wisdom of ACM Queue