DHCPv6 relaying: another trouble spot?

My DHCPv6+PPPoE post received a very comprehensive comment from Ole Troan (thank you!) in which he explains the context in which DHCPv6 was developed (a mechanism to give a static IPv6 prefix to a customer) and its intended usage (as the prefix is static, it should have a very long lifetime).

However, when you deploy DHCPv6 in some modern access networks (it’s not just PPPoE, Carrier Ethernet fares no better), you might experience subtle problems. Let’s start with a step-by-step description of how DHCPv6 works:

  • CPE router reboots. IPv6 is configured on the outside interface. We can use a link-local address or SLAAC (in which case two IPv6 prefixes are consumed per customer).
  • CPE router sends DHCPv6 request toward the PE-router. The DHCPv6 request includes the IA_PD option.
  • PE-router receives the DHCPv6 request and either allocates the requested IPv6 prefix from a local pool (in which case the prefix is dynamic and somewhat random) or forwards the DHCPv6 request to a central DHCPv6 server (DHCPv6 relay functionality).
  • In both cases, as the DHCPv6 reply goes back through the PE-router, the PE-router installs a static IPv6 route to the delegated IPv6 prefix. The next-hop is obviously the CPE router requesting the prefix.
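The last step above is the one that matters for the rest of this post, so here is a minimal Python sketch of it. This is a toy model, not IOS behavior: the `PeRouter` class, its method names and the documentation prefixes are hypothetical and chosen only for illustration.

```python
import ipaddress


class PeRouter:
    """Toy model of a PE-router that tracks delegated prefixes (hypothetical)."""

    def __init__(self):
        # delegated prefix -> link-local next hop of the CPE router
        self.static_routes = {}

    def handle_reply(self, delegated_prefix, cpe_link_local):
        """Install a static route when a reply carrying IA_PD passes through."""
        prefix = ipaddress.ip_network(delegated_prefix)
        next_hop = ipaddress.ip_address(cpe_link_local)
        if not next_hop.is_link_local:
            raise ValueError("expected the CPE link-local address as next hop")
        self.static_routes[prefix] = next_hop
        return prefix

    def lookup(self, destination):
        """Return the next hop for a destination, or None if no route matches."""
        address = ipaddress.ip_address(destination)
        for prefix, next_hop in self.static_routes.items():
            if address in prefix:
                return next_hop
        return None


pe = PeRouter()
pe.handle_reply("2001:db8:100::/56", "fe80::1")
print(pe.lookup("2001:db8:100:1::10"))  # inside the delegated prefix
print(pe.lookup("2001:db8:200::1"))     # never delegated, so no route
```

Without the `handle_reply` step the routing table stays empty and traffic toward the customer's prefix has nowhere to go, even though the CPE router holds a perfectly valid lease.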

My PE-router running IOS release 15.0(1)M did not insert the required static route when working as a DHCPv6 relay. DHCPv6 server functionality worked as expected.

So far, so good. Now imagine the PE-router reloads or its access Ethernet interface flaps. The PE-router loses all static routes to the CPE routers that were inserted in the IPv6 routing table based on DHCPv6 replies. However, the CPE routers assume everything is OK (in a typical mixed L2/L3 access network like the one shown below, a problem on one side does not result in a link loss on the other side) and try to renew the delegated prefix’s lease only when it’s about to expire. In the meantime, the customer has no IPv6 connectivity.
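To get a feeling for how long "in the meantime" can be, here is a rough Python sketch. It assumes the server sets the renewal timer T1 to half the preferred lifetime (the ratio RFC 3633 recommends, but servers are free to choose otherwise) and that the renewal is what restores the route; the function names are hypothetical.

```python
def renew_time(preferred_lifetime, t1_ratio=0.5):
    """Seconds after lease start when the client begins renewing (T1)."""
    return preferred_lifetime * t1_ratio


def outage_seconds(preferred_lifetime, reload_at, t1_ratio=0.5):
    """Time without a route if the PE loses state `reload_at` seconds into the lease."""
    t1 = renew_time(preferred_lifetime, t1_ratio)
    # A reload after T1 is out of scope here: the client is already renewing.
    return max(0.0, t1 - reload_at)


thirty_days = 30 * 24 * 3600
hours = outage_seconds(thirty_days, reload_at=3600) / 3600
print(f"{hours:.0f} hours without IPv6 connectivity")
```

With a 30-day preferred lifetime and a PE-router reload one hour into the lease, the customer would wait roughly 15 days for the next renewal. The very long lifetimes that make sense for a static prefix are exactly what makes the state loss so painful.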

Fortunately, the DHCPv6 implementation in Cisco IOS is pretty smart. When you use a local IPv6 pool on the PE-router, the PE-router rebuilds the static routes from the local DHCPv6 bindings. If you use a local pool and store the DHCPv6 bindings in a database, they survive a router reload as well.

It’s highly recommended to use ipv6 dhcp database to store the delegated prefixes.
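The idea behind a binding database can be sketched in a few lines of Python. This is only an illustration of the principle (write bindings to stable storage, rebuild routes from them after a reload); the JSON file format and the function names are hypothetical and have nothing to do with how IOS actually stores its bindings.

```python
import json
import os
import tempfile


def save_bindings(path, bindings):
    """Write prefix -> next-hop bindings to stable storage."""
    with open(path, "w") as handle:
        json.dump(bindings, handle)


def rebuild_routes(path):
    """After a reload, recreate the static routes from the stored bindings."""
    if not os.path.exists(path):
        return {}  # no database: every route is lost until clients renew
    with open(path) as handle:
        return json.load(handle)


database = os.path.join(tempfile.mkdtemp(), "bindings.json")
save_bindings(database, {"2001:db8:100::/56": "fe80::1"})

routes = {}                        # the reload wipes the in-memory routing state
routes = rebuild_routes(database)  # the stored bindings bring the routes back
print(routes)
```

A relay has no bindings of its own to save, which is why the same trick does not obviously carry over to the relay scenario.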

But what if you decide to use a central DHCPv6 server and DHCPv6 relaying on the PE-router? How would that combination survive a link loss or a router reload? What am I missing?

Update 2012-01-19: DHCP Bulk Lease, available in Cisco IOS release 15.1(S), solves the state loss in DHCPv6 relays.

14 comments:

  1. What was DHCPv6 host-identifier in your test? DUID or something else?
  2. This whitepaper gives a reasonable overview of the deployment options:
    http://www.cisco.com/en/US/prod/collateral/iosswrel/ps6537/ps6553/whitepaper_C11-472610.html

    A relay will snoop the PD options and should install static routes when a prefix is delegated through the relay. Another option we thought about was to use a new relay option (see the RAAN draft).

    Bulk lease query is also another way of recovering state at the PE.
    Markus and I wrote a draft at one point suggesting solutions to this problem:
    http://tools.ietf.org/html/draft-stenberg-v6ops-pd-route-maintenance-00

    In my view the 'cleanest' approach would be for the CPE to run some sort of "BFD echo" to discover lost forwarding state at the PE and reset its DHCP client state machine.
  3. Running 12.2(31)SB18, and seeing the same problem: the static route is not installed with DHCPv6 relay.
  4. TAC confirmed the issue in 12.2(31)SB18 and I've asked for a bug to be filed.

    Frank
  5. Will you be able to share the bug ID? It would be nice to check whether it's a cross-release one (or whether the 15.0M behavior is a different bug).
  6. OK, so I've checked. This isn't a bug, it is a feature. Or lack of such. I've escalated this internally and I expect that we will have full support for relay route injection in the latest branches soon. Currently it is in IOS XE, 12.2XN, 12.2SE.
  7. Thanks, Ole. You had told me offline that it was 12.2SRE, but above you write 12.2SE. Can you clarify?

    Frank
  8. Ole, thanks for the feedback!
  9. CSCtj94196 was opened for my issue.

    Frank.
  10. In PPPoE environments, it's common and usual for both the CPE and the BRAS to test the link with periodic PPP LCP Echo Requests/Replies.
  11. LCP Echo discovers L2 problems, not a missing static route on BRAS. There's also the "dialer interface" problem on Cisco IOS CPEs (see http://blog.ioshints.info/2010/10/dhcpv6-over-pppoe-total-disaster.html)
  12. I've tested 12.2(33)SRE2 on the 7609-S and 7206VXR and the DHCPv6-PD relay with static route insertion worked.

    Frank
  13. Synchronous thinking ... I did exactly the same tests last week, also with 12.2SRE2. Great results coming in tomorrow's blog post.
  14. Ivan:

    Can you test to see if static route insertion works with 12.2(31)SB20, 12.2(33)SRE3, or 15.1(4)M using PPPoE? Because it doesn't seem to work for me, and "debug ipv6 dhcp" and "debug ipv6 relay" and "debug ipv6 dhcp detail" show the external DHCPv6 server sending the PD, but there's no static route insertion. I've also used "debug ipv6 routing".

    Frank