DHCPv6 over PPPoE: Total disaster

Every time someone throws me an IPv6 curveball, I’m surprised when I discover another huge can of worms (I guess I should have learned by now). This time it started pretty innocently with a seemingly simple PPPoE question:

What happens if an ISP decides to assign dynamic IPv6 subnets? With static assignment, the whole stuff is pretty straight-forward due to ND, RA & DHCPv6, but if dynamic addresses are used, what happens if the subnet changes - how will the change be propagated to the end-user devices? The whole thing is no problem today due to the usage of NAT / PAT...

LAN address allocation with a changing DHCPv6 prefix is definitely a major problem, but it didn’t seem insurmountable. After all, you can tweak RA timers on the LAN interface, so even though the prefix delegated through DHCPv6 would change, the LAN clients would pick up the change pretty quickly. WRONG ... at least if you use Cisco IOS.

The core of the problem is a total disconnect between IPv6CP and DHCPv6 and between dialer interfaces and PPPoE sessions.

Remember:

  • IPv6CP does not negotiate IPv6 addresses (it only negotiates interface identifiers);
  • DHCPv6 prefix delegation has to be used to get a /64 (or /56) prefix to use on the LAN interface of the CPE router;
  • Cisco IOS uses dialer interfaces to configure the PPPoE client;
  • A dialer interface is always up, even if the underlying PPP session (which is bound to a virtual access interface) is not operational.

When the PPPoE session is established for the first time, the DHCPv6 client configured on the dialer interface sends a request (including IA_PD option) to BRAS and receives an IPv6 prefix that can be used on the LAN. Like any other DHCP allocation, the prefix has a lifetime that is usually measured in hours or even days.

If the PPPoE session is terminated for any reason (some ISPs, like the one I’m using, love to terminate PPPoE sessions every 24 hours just to annoy the users), the virtual access interface on BRAS goes down and the static route toward the DHCPv6-assigned prefix is gone. The DHCPv6 bindings on BRAS stay intact (so the CPE could reclaim the same prefix for a while).

However, the DHCPv6 client in the CPE router does not detect a link loss. While the virtual access interface does change state to down, the dialer interface doesn’t... and the DHCPv6 client is monitoring the dialer interface. The CPE router keeps using the old delegated prefix, which is no longer reachable (the static route on BRAS is gone and the client has not sent a renewal request yet).
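The disconnect is easier to see in a toy model. The Python sketch below is not how IOS is implemented; the class names, the interface names (Dialer0, Virtual-Access1) and the documentation prefix are all made up for illustration. It only shows that a client subscribed to the dialer interface never sees an event when the virtual access interface flaps.

```python
from dataclasses import dataclass, field


@dataclass
class Interface:
    """Hypothetical interface object that notifies watchers on state changes."""
    name: str
    up: bool = True
    watchers: list = field(default_factory=list)

    def set_state(self, up: bool) -> None:
        # Notify watchers only when the state actually changes.
        if up != self.up:
            self.up = up
            for callback in self.watchers:
                callback(self.name, up)


class Dhcpv6Client:
    """Toy DHCPv6-PD client that records the link events it receives."""

    def __init__(self, prefix: str) -> None:
        self.prefix = prefix
        self.events = []

    def on_link_change(self, name: str, up: bool) -> None:
        self.events.append((name, up))


dialer = Interface("Dialer0")            # logical interface, stays up
vaccess = Interface("Virtual-Access1")   # bound to the PPP session

client = Dhcpv6Client("2001:db8:1200::/56")    # documentation prefix, assumed
dialer.watchers.append(client.on_link_change)  # client watches the dialer only

# The PPPoE session drops and comes back: only the virtual access interface flaps.
vaccess.set_state(False)
vaccess.set_state(True)

print(client.events)  # [] : nothing tells the client to revalidate its prefix
```

Had the client been attached to `vaccess.watchers` instead, it would have received both transitions and could have reacted to them.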

Conclusion: the CPE router is stuck for the remaining duration of the DHCP lease unless you reset the DHCPv6 client manually with the clear ipv6 dhcp client interface command (which can be done with an EEM applet ... but try explaining that to an average user).
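To get a feeling for how long "stuck" might be, here is a rough back-of-the-envelope sketch. It assumes the client next contacts the BRAS on its own at the T1 renewal timer, and that T1 is 0.5 times the preferred lifetime (the value the DHCPv6 specification recommends; your ISP may hand out something else). The 7-day lifetime and the function name are hypothetical, and whether a renewal alone restores the route depends on the BRAS.

```python
def outage_window(preferred_lifetime_s: float,
                  session_drop_at_s: float,
                  t1_factor: float = 0.5) -> float:
    """Seconds between the PPPoE session drop and the next scheduled Renew.

    session_drop_at_s is measured from the moment the prefix was delegated.
    t1_factor = 0.5 is an assumption; the server can send a different T1.
    """
    t1 = preferred_lifetime_s * t1_factor
    return max(0.0, t1 - session_drop_at_s)


# Hypothetical numbers: 7-day preferred lifetime, forced disconnect after 24 hours.
window = outage_window(7 * 24 * 3600, 24 * 3600)
print(window / 3600)  # 60.0 hours without a usable prefix
```

With shorter lifetimes the window shrinks; if the session drops after T1 has already passed, the function returns zero.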

Workaround: You could assign fixed IPv6 prefixes to individual users through RADIUS, but then you’d have to propagate per-user /64 prefixes between the routers (at least within your POP).
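To put a number on "per-user /64 prefixes", the sketch below counts how many /64 routes could end up floating around a POP if every user gets a fixed prefix and can land on any BRAS. The /48 POP aggregate is a made-up value from the documentation range, not something taken from a real design.

```python
import ipaddress
from itertools import islice

# Hypothetical POP aggregate (documentation address space).
pop_aggregate = ipaddress.ip_network("2001:db8:4000::/48")
user_prefix_len = 64

# Each fixed per-user prefix is potentially one more route inside the POP.
user_routes = 2 ** (user_prefix_len - pop_aggregate.prefixlen)

# Peek at the first few per-user prefixes without materializing all of them.
first_three = [
    str(net)
    for net in islice(pop_aggregate.subnets(new_prefix=user_prefix_len), 3)
]

print(user_routes)  # 65536
print(first_three)  # ['2001:db8:4000::/64', '2001:db8:4000:1::/64', '2001:db8:4000:2::/64']
```

With dynamic allocation from a per-BRAS pool, each BRAS could advertise just its pool aggregate instead.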

More information: Some of the challenges of IPv6 core routing are described in the Building IPv6 Service Provider Core webinar. The webinar attendees also get auxiliary materials, including numerous sets of tested router configurations and detailed BRAS configuration snippets.

9 comments:

  1. I have a strange feeling that you are trying to fix non-existent problem. Why on earth would you like to change IPv6 prefixes dynamically for the same user? I think this idea did not pop up until now, because there is absolutely no need for a mechanism like that - NAT and dynamic allocations tried to preserve v4 space, what's the excuse now? :)

    I understand someone will try to tie to legacy practice, but can't we educate people and hopefully prevent that?

  2. Privacy? I.e. long-term user tracking? I know, I know, there are many other ways this can be (and is being) done above layer 3.

  3. Reuben Farrelly, 27 October, 2010 23:52

    http://ipv6.internode.on.net/configuration/adsl-cisco/

    Looks like the same problem. Does anyone have a Cisco DDTS bug ID for this or is this a "feature"?

  4. It is great reading your blog and see what challenge you will encounter next. ;-)

    With regards to DHCPv6 Prefix Delegation. The problem we tried to solve was basically to replace a fax message from the ISP to the customer. The delegated prefix was expected to have a long lifetime, perhaps equal to the length of the contract the customer has with the ISP.
    In any case the prefix lifetime is independent of the state of the upstream link, and the prefix should remain valid until its lifetime expires regardless.

    DHCPv6 PD does support changing prefixes, you can delegate two prefixes, an old and a new with different lifetimes and let the old time out over a few hours.
    It is not entirely trivial to "flash" renumber an IPv6 network. E.g. it will break existing sessions, ND doesn't let you set the valid lifetime to less than 2h, the user might have manually configured routers in a more complex network and so on...

    On link-state change on the WAN interface, the DHCPv6 client should confirm (through a renew message) that the prefix delegated is still valid. That it doesn't get the link state change through the Dialer interface does indeed sound like a bug.

  5. Sorry for the off-topic post/request, but I have been unsuccessful trying to find a good authoritative guide that explains why having multiple VLANs within a subnet or multiple subnets within a VLAN segment doesn't make a whole lot of sense.

    I reached your blog looking for some bridging/routing differences and find your posts *extremely* insightful. Hence the thought that perhaps you could share some insight on the issue of subnets v/s VLANs to enlighten your readers! (Even if you/others happen to have some links to share, that'd be great too!)

  6. Thanks, Reuben. That's exactly the solution to use.

    I would say that at the moment it's a feature (that's how dialer interfaces work).

  7. Thanks for the explanations. It's always nice to know the exact context in which a particular feature was designed.

    I was also not (yet) aware of the ND limits; they would indeed make a "flash renumber" (during which you'd obviously lose all the sessions like I do every morning at 10:30 when my beloved ISP clears my PPPoE IPv4 session) impossible.

    The DHCPv6 problem goes beyond the dialer interface. Carrier Ethernet access network with L2 switches has the same problems.

  8. Some people did some math and you need 55.000 years to scan one /64

    Apart from that, we have privacy address mechanism and all sort of crypto generators to create IPv6 address...

    Come on...

  9. Hey Ole,

    Pity you can't make it to the 4th Slo IPv6 summit, you and Ivan could meet there :) Maybe we can do it for the Spring summit, what do you say? :)

    I totally agree with "The delegated prefix was expected to have a long lifetime, perhaps equal to the length of the contract the customer has with the ISP. "

    This is how it is supposed to be done.



Ivan Pepelnjak, CCIE#1354, is the chief technology advisor for NIL Data Communications. He has been designing and implementing large-scale data communications networks as well as teaching and writing books about advanced technologies since 1990. See his full profile, contact him or follow @ioshints on Twitter.