IPv6-only Data Center (built by Tore Anderson)

A while ago I wrote about the uselessness of stateless NAT64 and got into a nice discussion with Tore Anderson, who wanted to use stateless NAT64 in the reverse direction (stateless NAT46) to build an IPv6-only data center. Some background information first (to define the context of his thinking before we jump into the technical details):

  • He’s running a web hosting business;
  • He uses public addresses on all the hosted servers (even when they’re sitting behind a firewall or load balancer) to avoid NAT44 state, ease the maintenance and give his customers direct access to their servers;
  • He knows IPv6 is the future and has already deployed content on IPv6;
  • He’d love to eliminate as many intermediate migration steps and stateful devices as possible, preferably jumping straight to an IPv6-only data center.

Here’s his short and concise list of requirements:

  • Stateless network w/"line-rate" performance;
  • No symmetric routing requirements;
  • No loss of end-user IP address information on the servers (for geolocation and logging purposes);
  • Maximized IPv4 address conservation;
  • Minimal extra complexity, and none of it on the servers or in the applications (for example, no dual stack).

Initially it seemed like stateless NAT46 (SIIT) might be a perfect solution, but then Tore stumbled across an interesting roadblock (the same one that caused me to declare stateless NAT64 useless): stateless NAT64 provides only an algorithmic mapping between the IPv4 address space and IPv6 addresses within a single prefix.
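To make the "algorithmic mapping" concrete, here is a minimal Python sketch of the common case where the IPv4 address is embedded in the low 32 bits of a /96 translation prefix (the layout RFC 6052 defines for /96 prefixes). The prefix `2001:db8:64::/96` and the function name `ipv4_to_ipv6` are assumptions made for illustration, not values from Tore's network.

```python
import ipaddress

# Hypothetical translation prefix taken from the documentation range.
NAT64_PREFIX = ipaddress.IPv6Network("2001:db8:64::/96")


def ipv4_to_ipv6(ipv4, prefix=NAT64_PREFIX):
    """Embed an IPv4 address in the low 32 bits of a /96 IPv6 prefix."""
    if prefix.prefixlen != 96:
        # Other prefix lengths place the IPv4 bits differently; not handled here.
        raise ValueError("this sketch only handles /96 prefixes")
    v4 = ipaddress.IPv4Address(ipv4)
    return ipaddress.IPv6Address(int(prefix.network_address) | int(v4))


print(ipv4_to_ipv6("192.0.2.33"))  # 2001:db8:64::c000:221
```

Because the mapping is purely arithmetic, the translator never picks the IPv6 address of a server: it follows from the server's public IPv4 address, which is exactly why an arbitrary 1:1 mapping is not available.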

In an ideal world, you’d be able to write 1:1 static NAT statements mapping IPv4 addresses into IPv6 addresses (and vice versa), but as most vendors implement what the RFCs say (and not what some creative users would like to see), the current implementation of stateless NAT64 on the ASR1K does not have the functionality Tore needs.

Lack of functionality has never stopped a creative engineer, and Tore is undoubtedly one of the finest. He solved the 1:1 mapping issue with an interesting application of IPv6 host routes (yes, you can propagate /128 IPv6 prefixes in all IPv6 routing protocols):

  • Every server (or load balancer) has a LAN IPv6 address (belonging to the LAN /48 prefix) and an internal IPv6 host address (a /128 belonging to the NAT64 address range). Tore uses loopback interfaces in the servers and VIP addresses in load balancers.
  • He could use routing protocols to advertise the /128s straight from the servers (or load balancers using Route Health Injection), but decided to add a level of indirection (and keep servers/load balancers strictly separated from the network).
  • He’s configured static /128 routes for the “internal” IPv6 addresses (pointing to the server/LB LAN IPv6 address) on the first-hop routers and redistributes them into the rest of the network.
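The host-route trick above can be sketched as a small Python script that turns a list of servers into static /128 routes. Everything in it is assumed for illustration: the translation prefix, the `SERVERS` table, and the LAN addresses are hypothetical, and the output merely imitates the style of IOS `ipv6 route` statements (Tore's actual configuration is in his presentation).

```python
import ipaddress

# Hypothetical translation prefix (documentation range).
NAT64_PREFIX = ipaddress.IPv6Network("2001:db8:64::/96")

# Hypothetical inventory: public IPv4 address -> server (or LB) LAN IPv6 address.
SERVERS = {
    "192.0.2.10": "2001:db8:1:10::a",
    "192.0.2.11": "2001:db8:1:10::b",
}


def host_routes(servers, prefix=NAT64_PREFIX):
    """Return one static /128 route per server, pointing at its LAN address."""
    routes = []
    for v4, lan in sorted(servers.items(), key=lambda kv: ipaddress.IPv4Address(kv[0])):
        # "Internal" address: the IPv4 address embedded in the translation prefix.
        internal = ipaddress.IPv6Address(
            int(prefix.network_address) | int(ipaddress.IPv4Address(v4))
        )
        next_hop = ipaddress.IPv6Address(lan)
        routes.append(f"ipv6 route {internal}/128 {next_hop}")
    return routes


for line in host_routes(SERVERS):
    print(line)
```

The first-hop router would carry one such route per server and redistribute it, so the rest of the network only ever sees ordinary IPv6 host routes.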

With this setup, he could deploy generic NAT46 on the edges of his network (using the same one-line NAT64 configuration on all edge routers) and use per-server routing throughout his network to push IPv6 traffic destined for the translated IPv4 address toward the right server. Not the easiest design to master at 2 AM on a Sunday morning, but once you understand it, it's probably simpler to maintain than a hodgepodge of inter-protocol load balancers and/or dual-stack servers.

Note: His setup has two additional advantages: it works well with SSL (because stateless NAT46 does not touch the TCP layer or anything above it) and does not require HTTP header insertion (like X-Forwarded-For) usually used by SLB46 load balancers. You’re always able to deduce the source IPv4 address of the client from IPv6 server logs.
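Recovering the client's IPv4 address from a logged IPv6 source address is the same arithmetic in reverse. The sketch below assumes a hypothetical /96 translation prefix (`2001:db8:64::/96`) and a hypothetical helper name `client_ipv4`; with a different prefix length the IPv4 bits would sit elsewhere in the address.

```python
import ipaddress

# Hypothetical translation prefix (documentation range).
NAT64_PREFIX = ipaddress.IPv6Network("2001:db8:64::/96")


def client_ipv4(logged_address, prefix=NAT64_PREFIX):
    """Return the embedded IPv4 address, or None for a native IPv6 client."""
    addr = ipaddress.IPv6Address(logged_address)
    if addr not in prefix:
        return None  # not a translated address, nothing to extract
    # With a /96 prefix the IPv4 address is the low 32 bits.
    return ipaddress.IPv4Address(int(addr) & 0xFFFFFFFF)


print(client_ipv4("2001:db8:64::c633:6407"))  # 198.51.100.7
print(client_ipv4("2001:db8:1::1"))           # None
```

A log-processing or geolocation script could run every logged source address through such a function and fall back to the IPv6 address when it returns `None`.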

For more details, please view Tore’s presentation from the RIPE64 meeting or download it in PDF format (where you’ll find the ASR1K configuration as well).

5 comments:

  1. So 'No symmetric routing requirements'?

    How is that possible with:

    "even when they’re sitting behind a firewall or load balancer"

    What firewalls and (non-DNS) load balancers do not care about symmetry?

  2. Will, stateless firewalls don't care about symmetric routing. Load balancers typically do, but they're close to the servers (which have state anyway).

    What I want to avoid is centralised state in the network itself - the data center network is shared between multiple customers, and I don't want any stateful devices in the shared infrastructure for reliability and performance reasons. That there's state kept inside a single customer's infrastructure is unavoidable, but if a DoS attack fills the state tables in a single customer's load balancer, other customers are unaffected. If the attack fills the state tables for a centralised NAT44 solution, *every* customer is affected.

    Tore

  3. I'm not sure I understand why the ASR1K is not able to do 1:1 static translation (NAT46)... This IOS XE command:

    nat64 v6v4 static 2001:ABCD:100::2 199.1.1.3

    should translate the destination 199.1.1.3 to 2001:ABCD:100::2, shouldn't it? I'm surely missing something...

    Regards,

  4. Martin B, in my testing I found that this command would do the static mapping I wanted. However, it invokes stateful mode - the source address will be mapped into the prefix defined by "nat64 prefix stateful", and all flows show up in the output of "show nat64 translations".

    I would very much like to see a feature that did the exact same thing for stateless mode. Adding the IPv4-translatable addresses to the servers, adding static routes to them on the servers' default gateways, importing them into the IGP ... I don't *like* this solution - it just works for a proof-of-concept test. For a production deployment, it would be much better (less hassle and complexity) to have a list of static mappings pushed to all the translators, and do nothing special with the servers or their access routers.

    Tore

    1. Thank you, Tore. I didn't realize that stateful mode was involved with a static nat64 statement, which should be "stateless only" from my point of view... I totally agree with you.

      Best regards,



Ivan Pepelnjak, CCIE#1354, is the chief technology advisor for NIL Data Communications. He has been designing and implementing large-scale data communications networks as well as teaching and writing books about advanced technologies since 1990. See his full profile, contact him or follow @ioshints on Twitter.