It’s hard for me to admit, but there just might be a corner use case for split subnets and inter-DC bridging: even if you move a cold VM between data centers in a controlled disaster avoidance process (moving live VMs rarely makes sense), you might not be able to change its IP address because that address is hardcoded somewhere, be it in application code or configuration files.
Disaster recovery is a different beast: if you’ve lost the primary DC, it doesn’t hurt if you instantiate the same subnet in the backup DC.
However, before jumping headfirst into a misty pool filled with unicorn tears (actually, a brittle solution with too many moving parts that usually makes no sense), let’s see if there are alternatives. Here are some ideas in exponentially decreasing order of preference:
Ever heard of DNS? If the application uses hardcoded addresses in its clients or between servers, there’s not much you can do, but one would expect truly hardcoded addresses only in home-brewed craplications … and masterpieces created by those “programming gurus” who never realized hostnames should be used in configuration files instead of IP addresses.
If your application is somewhat well-behaved, there are all sorts of dynamic DNS solutions that you can use to automatically associate a server’s new IP address with its DNS FQDN. Windows clusters do that automatically, many DHCP servers automatically create dynamic DNS entries after client address allocation, and there are numerous Linux clients that you can use even with static IP addresses.
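To make the dynamic DNS idea a bit more tangible, here is a minimal sketch that builds the input for BIND’s `nsupdate` utility after a server comes up with a new address. The function name, the hostname, the TTL, and the addresses are all made up for illustration; a real deployment would also need TSIG or GSS-TSIG authentication, which is left out here.

```python
import ipaddress
from typing import Optional


def build_nsupdate_script(fqdn: str, new_ip: str, ttl: int = 300,
                          server: Optional[str] = None) -> str:
    """Return nsupdate input replacing the A record of fqdn with new_ip.

    Hypothetical helper: it only renders text, it does not talk to a DNS server.
    """
    address = ipaddress.IPv4Address(new_ip)  # raises ValueError on bad input
    name = fqdn if fqdn.endswith(".") else fqdn + "."
    lines = []
    if server:
        lines.append(f"server {server}")
    # Remove the old A record(s) first, then add the new one.
    lines.append(f"update delete {name} A")
    lines.append(f"update add {name} {ttl} A {address}")
    lines.append("send")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example values only: the VM landed in the backup DC with a new address.
    print(build_nsupdate_script("app1.example.com", "192.0.2.10"), end="")
```

You could pipe the resulting text into `nsupdate` from a boot-time script, which is roughly what the existing dynamic DNS clients do for you.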
Host routes? For whatever reason some people think host routes are worse than long-distance bridging. They’re not. If nothing else, you have all the forwarding information in one place … and modern L3 switches use host routes for directly connected IP hosts anyway.
Automatic network-side configuration of host routes is mission impossible. Local Area Mobility (LAM) worked years ago, but was not supported in data center switches until Cumulus Networks reinvented it with Redistribute ARP.
… and the only other mechanism I could think of that wouldn’t involve loads of homemade scripting glue is OpenFlow-driven RIB modification supposedly working on Juniper MX routers.
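Why do host routes work at all? Longest-prefix matching: a /32 for the moved VM beats the /24 of the subnet it left behind. The following sketch shows the lookup logic with a toy routing table; the prefixes and the next-hop labels are invented for the example, and a real router obviously doesn’t do a linear scan.

```python
import ipaddress


def lookup(rib, destination):
    """Return the next hop of the most specific route covering destination."""
    dst = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in rib.items() if dst in net]
    if not matches:
        return None
    # Longest prefix wins, exactly as in a real forwarding table.
    return max(matches, key=lambda entry: entry[0].prefixlen)[1]


# The subnet still lives in the primary data center...
rib = {ipaddress.ip_network("10.1.1.0/24"): "primary-dc"}
# ...but one VM has been moved, so we add a host route for it.
rib[ipaddress.ip_network("10.1.1.10/32")] = "backup-dc"

if __name__ == "__main__":
    print(lookup(rib, "10.1.1.10"))  # backup-dc
    print(lookup(rib, "10.1.1.11"))  # primary-dc
```

The hard part is not the forwarding, it’s getting that /32 into the routing table automatically.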
Routing protocols? Routing protocols running on servers were pretty popular years ago (that’s how IBM implemented IP multipathing on mainframes). Instead of configuring the hardcoded IP address on the server’s LAN interface, configure it on a loopback interface, and run BGP between the servers and adjacent ToR switches … and whatever you do, please don’t use OSPF. Some IBM mainframes were a single link failure away from becoming the core data center router.
Yeah, I know, a stupid solution like this requires actual changes to server configurations … and it’s so much easier to pretend the problem doesn’t exist and claim that the network should support whatever we throw at it ;)
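For the curious, this is roughly what the server-side change amounts to. The sketch renders an FRR-style BGP stanza that advertises the loopback address as a /32 to the ToR switches. The AS numbers and addresses are invented, the syntax is my recollection of FRR’s configuration format (check it against whatever routing daemon you run), and the loopback interface configuration itself is not shown.

```python
import ipaddress


def server_bgp_config(local_as, service_ip, tor_peers):
    """Render an FRR-style BGP stanza advertising service_ip as a host route.

    tor_peers is a list of (peer_ip, peer_as) tuples; all values here are
    hypothetical examples, not a tested production configuration.
    """
    host = ipaddress.IPv4Address(service_ip)  # validates the address
    lines = [f"router bgp {local_as}"]
    for peer_ip, peer_as in tor_peers:
        ipaddress.IPv4Address(peer_ip)  # validate each neighbor address
        lines.append(f" neighbor {peer_ip} remote-as {peer_as}")
    lines.append(" address-family ipv4 unicast")
    lines.append(f"  network {host}/32")  # the loopback address as a /32
    lines.append(" exit-address-family")
    return "\n".join(lines)


if __name__ == "__main__":
    print(server_bgp_config(65101, "192.0.2.10",
                            [("10.0.0.1", 65000), ("10.0.1.1", 65000)]))
```

Two EBGP sessions (one per ToR switch) and a single advertised host route: the server can never become a transit router unless you go out of your way to misconfigure it.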
Route Health Injection on load balancers? Same idea as server-side routing protocols, but implemented in front of the whole application infrastructure.
Assuming your application sits behind a load balancer and you’re doing a cold migration of all application components in one step, you can preconfigure all the required IP subnets in the disaster recovery site (after all, they’re hidden behind a load balancer) and rely on the load balancer to insert the publicly visible route to the application’s public IP address once everything is ready to go.
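The decision the load balancer has to make is simple enough to sketch in a few lines. The function and the component names below are hypothetical, and I’m assuming the cold-migration case described above, where the route is injected only when every component is ready (production RHI implementations typically track the health of the server pool behind a virtual IP address).

```python
def routes_to_advertise(vip, component_ready):
    """Return the host routes a load balancer at one site should inject.

    component_ready maps a component name to True once it is up at this site.
    An empty map means nothing has been migrated yet, so nothing is advertised.
    """
    if component_ready and all(component_ready.values()):
        return [f"{vip}/32"]
    return []


if __name__ == "__main__":
    # Example values: the database has not been started in the backup DC yet.
    backup_site = {"web": True, "app": True, "db": False}
    print(routes_to_advertise("192.0.2.80", backup_site))  # []
    backup_site["db"] = True
    print(routes_to_advertise("192.0.2.80", backup_site))  # ['192.0.2.80/32']
```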
The universal duct tape – NAT. If the clients use DNS to connect to the servers, but the servers have to use fixed IP addresses, use NAT to hide server subnets behind different public IP addresses (one per site).
Obviously you have to move the whole application infrastructure at once if you want to use this approach or things will break really badly.
Apart from the usual NAT-is-bad and NAT-breaks-things mantras, there are a few additional drawbacks:
- Clients have to rely on changed DNS records as you cannot insert a host route into the outside network like a load balancer can with RHI.
- NAT devices usually don’t support dynamic DNS registration, so you have to change the DNS entries “manually”.
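Here’s a small sketch of the per-site static NAT idea: the servers keep the same inside address in both data centers, and each site maps the inside subnet to its own public prefix. All prefixes and site names are documentation or private ranges picked for illustration, and the one-to-one mapping that preserves the host part is my assumption, not the only way to do it.

```python
import ipaddress


def translate(inside_ip, inside_net, outside_net):
    """Map inside_ip to the address with the same host offset in outside_net."""
    inside = ipaddress.ip_network(inside_net)
    outside = ipaddress.ip_network(outside_net)
    address = ipaddress.ip_address(inside_ip)
    if address not in inside:
        raise ValueError(f"{address} is not in {inside}")
    if inside.prefixlen != outside.prefixlen:
        raise ValueError("one-to-one NAT needs prefixes of equal length")
    offset = int(address) - int(inside.network_address)
    return str(outside.network_address + offset)


# Same server subnet in both sites, different public prefix per site.
SITE_PUBLIC_PREFIX = {"primary": "192.0.2.0/24", "backup": "198.51.100.0/24"}

if __name__ == "__main__":
    for site, prefix in SITE_PUBLIC_PREFIX.items():
        print(site, translate("10.1.1.10", "10.1.1.0/24", prefix))
```

The DNS record for the application has to follow the move: it points to 192.0.2.10 while the application runs in the primary site and to 198.51.100.10 after the migration.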
Virtual appliances? Duncan Epping proposed using vShield Edge as a NAT device.
While that idea didn’t sound so great when I wrote the original blog post, that’s how most overlay virtual networks implement overlay-to-physical gateways, and at least VMware decided to use BGP as the only routing protocol in this scenario.
Anything else? I’m positive I’ve missed an elegant idea or two. Your comments are most welcome … including those telling me why the ideas mentioned above would never be implementable.