Coming Full Circle on IPv6 Address Length
In the Future of Networking with Fred Baker, Fred mentioned an interesting IPv6 deployment scenario: give a /64 prefix to every server to support container deployment, and run a routing protocol between the servers and the ToR switches (preferably over link-local addresses) to advertise the /64 prefixes into the data center fabric.
- We just turned 128-bit IPv6 addresses into 64-bit endpoint identifiers.
Do I have to mention that the original IPv6 proposal had 64-bit addresses, and they added the extra 64 bits to support IPX-style auto-configuration?
- Endpoint address is assigned to a node, not to an interface.
- Endpoints use a node-to-router protocol to advertise their endpoint address.
- All routers within a domain advertise individual endpoint addresses.
- Endpoint addresses are summarized into a larger prefix at the routing domain boundary.
Hooray, yet again we reinvented CLNP. We could have used it 25 years ago instead of inventing a new protocol.
Note: in case you’re still wondering what this IPv6 thing is all about, check out my IPv6 content.
Simple RIP should be enough for this purpose (advertise the /64, receive a default route).
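A minimal sketch of that idea, assuming FRR's ripngd on the server side (the interface name `eth0` and the 2001:db8:1::/64 prefix are placeholders):

```
! /etc/frr/frr.conf (ripngd) -- hypothetical server-side config
router ripng
 network eth0           ! run RIPng on the fabric-facing interface
 route 2001:db8:1::/64  ! originate the server's /64 into RIPng
```

The ToR would run RIPng on the server-facing ports and send only a default route back.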
We run dynamic IBGP/EBGP peering with the downstream servers on both IPv4 and IPv6 on the ToRs:

- Dynamic BGP on IPv4: /26 range
- Dynamic BGP on IPv6: /64 range (sample config: neighbor 2001::/64)

We don't use link-local addresses for this IPv6 peering; we use global addressing for the dynamic-range peering. Toward the servers it's dynamic IPv6 IBGP peering for the /64; the rest of the /64 is in EUI-64 format. BGP peering on the servers runs on static configuration, because a range cannot form a neighborship on both sides. Link-local would be fine, but we need dynamic BGP peering, hence global addressing, with the server addresses in the same /64 range as the VLAN. It runs seamlessly on BGP. Link-local is link-specific and therefore not useful for dynamic peering.
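A hedged sketch of such a setup on the ToR, using FRR's dynamic-neighbor syntax (the ASN, the SERVERS peer-group name, and the 2001:db8::/64 listen range are assumptions, not the commenter's actual config):

```
! FRR bgpd -- hypothetical ToR-side dynamic IPv6 IBGP peering
router bgp 65000
 neighbor SERVERS peer-group
 neighbor SERVERS remote-as 65000                   ! IBGP with the servers
 bgp listen range 2001:db8::/64 peer-group SERVERS  ! accept any server in the /64
 address-family ipv6 unicast
  neighbor SERVERS activate
 exit-address-family
```

The ToR passively accepts sessions from any address in the range, while each server points a statically configured session at the ToR's (global) address, which is why the server side cannot itself be a range.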
In the basic (probably most common) scenario one does not even need a routing protocol: the ToR switches can be configured with /64 static routes pointing to the servers, which in turn have static link-local addresses. The ToR then summarizes the /64s into a shorter block, and so on.
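A sketch of that static-route variant in Cisco-style syntax (the prefixes, interface, and link-local next hop are made up for illustration); note that a static route via a link-local next hop always needs an exit interface:

```
! ToR: one /64 static route per server, next hop = the server's static link-local
ipv6 route 2001:db8:42::/64 Ethernet1/1 FE80::42
! Summarization toward the fabric: advertise the aggregate and discard
! anything that doesn't match a more specific server route
ipv6 route 2001:db8::/48 Null0
```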
The major benefit is being able to allocate an IP address per process/container. I think one of Google's papers admitted that going with an IP per box for Borg, and then juggling the available ports per process/container, was a major pain.
Does this affect dynamic-BGP-based multipathing?
Or do we have static routes toward the control servers and load-share across them, so that the actual content balancing happens between the control and data servers?
Also, static routes pointing to link-local addresses are a bit tedious, because they require ND cache population to learn each server's link-local address, which would sit in fe80::/10 with the 48-bit MAC split by FFFE (modified EUI-64 format). It's better to have dynamic BGP peering advertise the content blocks upstream and form BGP sessions with the control servers.
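The modified-EUI-64 expansion mentioned above (48-bit MAC split by FFFE, with the universal/local bit flipped) can be sketched in Python; the MAC address used here is just an example:

```python
import ipaddress

def link_local_from_mac(mac: str) -> ipaddress.IPv6Address:
    """Build a modified-EUI-64 link-local address (fe80::/64) from a MAC."""
    octets = [int(b, 16) for b in mac.split(":")]
    octets[0] ^= 0x02                               # flip the universal/local bit
    eui64 = octets[:3] + [0xFF, 0xFE] + octets[3:]  # insert FF:FE in the middle
    suffix = int.from_bytes(bytes(eui64), "big")
    return ipaddress.IPv6Address((0xFE80 << 112) | suffix)

print(link_local_from_mac("52:54:00:12:34:56"))  # → fe80::5054:ff:fe12:3456
```

This is exactly the address the ToR would have to learn (via ND) before a static route pointing at it becomes usable.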
But that's a good point you made about BGP overhead; it is always a separate control-plane component.
How many servers are there per rack, and how do you provision manual link-local addresses on the servers, starting with fe80::/10 and filling the rest with your own addressing?