IPv6 Neighbor Discovery exhaustion attack and IPv6 subnet sizes
A few days ago I got an interesting question: “What’s your opinion on the IPv6 NDP exhaustion attack and the recommendation to use /120 instead of /64?”
I guess we all heard the fundamentalist IPv6 mantra by now: “Every subnet gets a /64.” Being a good foot soldier, I included it in my Enterprise IPv6 webinar. Time to fix that slide and admit what we also knew for a long time: IPv6 is classless and we have yet to see the mysterious device that dies in flames when sniffing a prefix longer than a /64.
Before you rush out and change all your /64 prefixes to /120s, a few words of caution:
- You have to use /64 prefixes on subnets running SLAAC.
- I wouldn’t be surprised if some host stacks would be broken enough to die when faced with a local prefix not equal to /64. Conclusion: use /64 on all subnets to which workstations are attached.
- Likewise, I wouldn’t expect consumer CPE vendors to understand IPv6 can be classless. As above, use /64s in consumer environments.
- If a layer-3 forwarding device breaks down when having a prefix longer than /64 in its IPv6 routing table, throw it away.
Jeff Wheeler proposes to use /120 on all (data center?) subnets. I never tested this idea in practice and have no clue whether common server operating systems (Linux, Windows) would work with static IPv6 addresses out of a /120 prefix. Real-life experience? Please write a comment!
As a precaution against yet-to-be-discovered bugs, you could decide to use a single /120 prefix out of a /64 prefix on server-facing subnets (if the /120 prefix fails, you can easily go back to a /64 prefix without renumbering anything else but the affected subnet).
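The fallback argument can be checked with a minimal sketch using Python's standard `ipaddress` module. The prefix 2001:DB8:C001:BABE::/64 and the server address are illustrative assumptions taken from documentation address space, not addresses from a real deployment.

```python
import ipaddress

# Illustrative documentation prefixes (assumptions, not a real deployment)
subnet64 = ipaddress.ip_network("2001:db8:c001:babe::/64")
subnet120 = ipaddress.ip_network("2001:db8:c001:babe::/120")

# The /120 is carved out of the /64, so the two plans never conflict
carved_from_64 = subnet120.subnet_of(subnet64)

# A hypothetical statically addressed server inside the /120
server = ipaddress.ip_address("2001:db8:c001:babe::10")

# The same address is valid under either prefix length, so falling back
# from /120 to /64 changes only the prefix length, not the address
valid_in_both = server in subnet120 and server in subnet64

print(carved_from_64, valid_in_both)
print(subnet120.num_addresses)  # addresses in the /120
print(subnet64.num_addresses)   # addresses in the /64
```

The size difference is the whole point: the /120 leaves at most 256 addresses an attacker can force the router to resolve, while the /64 leaves 2^64.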
Alternatively, you could decide to be on the safe side, use the /64 prefixes on server subnets, assign static IPv6 addresses with high bits set to zero to servers (for example, use only 2001:DB8:C001:BABE::2/64 through 2001:DB8:C001:BABE::FE/64) and deploy inbound access lists on the L3 switches dropping packets sent to IPv6 addresses outside of that range.
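As a rough illustration of what such an access list has to express, the sketch below models the permitted range in Python with the standard `ipaddress` module. The function name `is_permitted` is hypothetical, and the code only models the address check; it is not the ACL syntax of any particular platform.

```python
import ipaddress

# Permitted server range from the example above
LOW = ipaddress.ip_address("2001:db8:c001:babe::2")
HIGH = ipaddress.ip_address("2001:db8:c001:babe::fe")

def is_permitted(dst: str) -> bool:
    """Return True if dst falls inside the statically assigned server range."""
    addr = ipaddress.ip_address(dst)
    return LOW <= addr <= HIGH

# ACLs usually match prefixes, not ranges, so the exact range has to be
# expressed as a list of prefixes that cover it
prefixes = list(ipaddress.summarize_address_range(LOW, HIGH))

print(is_permitted("2001:db8:c001:babe::10"))
print(is_permitted("2001:db8:c001:babe:dead:beef:0:1"))
for p in prefixes:
    print(p)
```

Covering the exact range takes several prefixes; a single permit for 2001:DB8:C001:BABE::/120 is a simpler approximation if you can tolerate also admitting the ::0, ::1 and ::FF addresses.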
Last but definitely not least, using /64 prefixes on point-to-point core links (and being exposed to script kiddies) is ridiculous. Juniper formalized this line of thinking with a standards-track RFC (RFC 6164), recommending /127 prefixes on point-to-point links. And once you leave the 64-everywhere dogma behind, you can make the final step and allocate /128s to loopback addresses (I’ve tested this in Cisco IOS – works like a charm). Welcome back to the VLSM world.
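To see why the prefix length matters on a point-to-point link, here is a small sketch (again Python's `ipaddress` module) that counts the addresses left over once both routers have theirs. The link and loopback prefixes are made-up documentation addresses, and `unused_addresses` is a hypothetical helper.

```python
import ipaddress

def unused_addresses(prefix: str, in_use: int) -> int:
    """Addresses in the prefix that no node owns (ND resolution would fail for them)."""
    return ipaddress.ip_network(prefix).num_addresses - in_use

# Hypothetical point-to-point link with two routers attached
p2p_64 = unused_addresses("2001:db8:0:1::/64", in_use=2)
p2p_127 = unused_addresses("2001:db8:0:1::/127", in_use=2)

# Hypothetical loopback: a /128 contains exactly one address
loopback = ipaddress.ip_network("2001:db8::1/128")

# Both ends of the /127 link
ends = [str(a) for a in ipaddress.ip_network("2001:db8:0:1::/127")]

print(p2p_64)   # unresolvable targets on a /64 link
print(p2p_127)  # unresolvable targets on a /127 link
print(ends, loopback.num_addresses)
```

With a /127 there is simply nothing left for an attacker to scan: every address on the link belongs to one of the two routers.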
It must be understood that ACLs may not protect your device. There are major-vendor boxes that will still at least reach their NDP policer (if not actually learn ND entries) when receiving packets with new source addresses on a locally-attached subnet, even if the packet will ultimately be discarded due to an ingress ACL on that interface. Operators should test their routers, because vendors absolutely do not provide reliable answers in this area -- including, again, the "big vendors" who we generally expect to Do The Right Thing.
It is also worth mentioning that this absolutely will break IPv4 on many dual-stack routers. Most people who think they may be ready for IPv6 today are not, and this is only one of many reasons why. We need to do a better job of asking our vendors for needed improvements before we are all forced to play catch-up.
"ACL won't protect you" - amazing how broken things can get. I always assumed an input ACL is the very first thing checked by an L3 device. Am I right in assuming that hitting this particular bug would require the attack to be an inside job (pwned server) ... or the attacker targeting your WAN link?
"This will break IPv4 on many dual-stack routers" - just to clarify for everyone else reading the comments: I'm assuming you're saying NDP exhaustion attack also breaks IPv4 on those devices that use common v4/v6 L3 adjacency entries. Using /120 on a subnet should not impact IPv4 at all.
If you are running any dynamic routing protocol, the routes point to the link-local addresses of neighbor interfaces anyway, so assigning a global IPv6 address to a router interface is for troubleshooting purposes only. That is where loopbacks come in, but with loopbacks only you never see the exact egress interface, just the loopback of the router. This might suck a bit in some corner cases.
Use with caution (as everything else) if in doubt :)
LLA are a pain if you're trying to figure out the exact path across the network with traceroute.
Also, you might not be able to do hop-by-hop telnetting with LLA if your IGP breaks down (telnet to an LLA does work, but sometimes you don't have your neighbor's LLA in your ND cache).
In my opinion, implementing IPv6 must not become a burden for the network admin, so I try to implement IPv6 in the way that is easiest for me. While implementing a dual-stack network, I try to match the IPv6 address assignment to the existing IPv4 one, so it is easy for the admin to know which is which. Before I added IPv6 PTR records, all traceroutes looked cryptic to me; with them I can check whether my network is working properly.
I also try to take advantage of IPv6: a /64 is huge, so there is no need to renumber and resize subnets as with IPv4. I only need to remember three kinds of allocation: /32 or /48 for one organization, /64 for LAN subnets, and /128 for loopbacks. That's it.
For the Internet service it's even more draconian, and they only permit /48s. There is a table on Wikipedia that's kept fairly up to date regarding the IPv6 routing policies of many large carriers.
There is a lot of speculation about IPv6 but little hard documentation. There are no good reference designs for a global enterprise. There are routing symmetry issues revolving around security and other services that the network is providing today. Other technologies like network-based IDS/IPS, data loss prevention, and web content filtering fall apart quickly as well.
We are actively deploying IPv6 in a global infrastructure and facing many serious issues. Neither Cisco, nor AT&T, nor our other vendors and carriers (even with their professional services groups) have good answers to offer at this time for all of the core issues that still exist with IPv6, and they all disagree on what the best practices are. AT&T wants /64s while Sprint wants /127s. It's all just the tip of a very large iceberg.
Enforcing prefix lengths in an MPLS/VPN network is plain stupid (or maybe your SP bought those mysterious boxes that self-destruct on receiving a longer prefix). The SP should not enforce the content (including prefix lengths), but just the maximum number of prefixes accepted from a site or total # of prefixes in a VRF.
Routing symmetry between private and public networks across firewalls ... nightmare! Right now we're working with a customer with similar issues and will probably make it work, but it will be way more complex than NAT would have been.
But how are we supposed to implement this if after 10+ years there is still no rock-solid standard?
That said, there's no reason not to use /127s on point-to-point links. If a device doesn't support /127s, it's a vendor problem, not a design problem.
Given the world they operate in today, until the hardware is completely refreshed over a period of years, those restrictions are probably going to remain in place. These are not small and stupid telcos either; they include AT&T.
@nosx: iACLs don't work on all platforms (see Jeff's comment below). Unadvertised PA space definitely helps.
Slowly we are refining IPv6 myths and best practices. IMO this makes IPv6 world more sane (/64 loopback...?!) and less fundamentalist.
If a "close friend" has layer-2 access to your switch -
you are in trouble w/o IPv6.
all layer-2/3 switch security must be used and/or adopted for IPv6 too.
like in today's v4 networks - 802.1x/ARP-Inspect/DHCP-Snoop
the RA Guard is still for Cat6500 only ;(
Moreover, I'd say SLAAC is good for PoC labs - or maybe you have a
SLAAC/WinXP solution; otherwise for usual static & dhcp setup you'd be able to
protect your networks
BTW
Since IOS 12.3T I have been using /128 for loopbacks
Not actually answering your request :-) I did a quick lab on a 2901 with 15.1(3)T and a Win7 host on the other end. The DHCP server implementation on the Cisco end hands out addresses to the Win7 host with a /120 defined in the address prefix section and the managed-config flag set. ND/RD works fine, pings work normally, and a Wireshark capture on the Win7 host shows everything going by the book.
No idea what happens with W2003 or W2008 but I would suppose they'd work fine too.
Excellent blog, keep up the good work.
Today, when we would have to start serious IPv6 deployments, we're faced with "what do you mean you don't have feature XXX in IPv6" revelations.
I wasn't able to get more than ~250 incomplete entries in the neighbor cache.
At the same time, learning new, valid ND entries didn't seem to be an issue.
So although it's a problem in theory, it seems that most implementations, i.e. Cisco (as described by Strech) and Juniper, limit the effects of such an attack.
Note RFC 3627 http://tools.ietf.org/html/rfc3627 section 5, the part about u/l bits being zero. I'm not sure how likely those bits are to get used, I'm tracking IPv6 but not super-closely.
What about just using LLs for the infrastructure links?
The other issues are possible fragmentation attacks and path MTU poisoning at end nodes?
Checksum-related attacks (IPv6 forces checksums for UDP etc.), router resource issues?
Extension header stacking and processing?
Scaling from the application to the ASIC level. IPv6 uses 128-bit addresses, which have to be split on 64-bit architectures = added CPU cycles, whereas IPv4 source and destination addresses fit nicely into one 64-bit word.
and more to come.
/128s are not announced as such by OSPFv3.
If I remember correctly, /128 loopbacks are advertised by OSPFv3 as stub networks with a /64 prefix... I am fairly but not 100% sure about the /64, but they are definitely not advertised as /128!
a. in order to have a sane addressing plan, we've decided long ago to allocate /64 subnets
b. SLAAC is desirable sometimes (workstation LANs etc)
(B) You mentioned SLAAC and data center switches in (almost) the same sentence ;)
thanks Ivan
I am just curious about the real potential of such attack.
When a resolution is performed with ND default values, a ND entry is created in
the state INCOMPLETE and a NS is sent. If no NA reply is received after
RetransTimer milliseconds (default: 1 second) it should then retransmit a NS
maximum MAX_MULTICAST_SOLICIT (default: 3) times. Then the entry is cleared from
the cache.
So the entry will not stay in the table more than 3 seconds before it is cleared.
For sure if an attacker keeps on scanning, it will fill the table faster than the
table will be purged. But it will take some time to fill up the table
and the attack must be quite continuous without interruption or entries will be
deleted automatically.
This means that it should not be difficult to detect and to isolate the attacker.
If it comes from the outside it must pass firewalls which should be able to
manage this and take appropriate action at least to mitigate so it will not be
able to do much harm if it cannot block it.
If it is local, an IDS capable of detecting port scan and other attacks should
also be able to isolate the attacker.
So is it really such a big threat ?
Fred
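The timing argument in the comment above can be turned into back-of-the-envelope arithmetic. The sketch below is an idealized model under stated assumptions: the ND defaults quoted above, every attack packet targeting a distinct unresolved address, and no policer or per-interface limit on the router. The function names and the packet rate and table size used at the end are hypothetical, not measurements from any platform.

```python
# ND defaults quoted in the comment above
RETRANS_TIMER_S = 1.0        # RetransTimer: 1000 ms
MAX_MULTICAST_SOLICIT = 3    # solicitations sent before the entry is dropped

def incomplete_lifetime_s() -> float:
    """How long one INCOMPLETE entry occupies the neighbor cache."""
    return RETRANS_TIMER_S * MAX_MULTICAST_SOLICIT

def steady_state_entries(packets_per_second: float) -> float:
    """INCOMPLETE entries held at a constant scan rate (arrival rate x lifetime)."""
    return packets_per_second * incomplete_lifetime_s()

def rate_to_fill(table_size: int) -> float:
    """Scan rate (packets/s) needed to keep a table of table_size entries full."""
    return table_size / incomplete_lifetime_s()

# Hypothetical numbers, for illustration only
print(steady_state_entries(1000))   # entries held at 1000 packets/s
print(rate_to_fill(30000))          # packets/s to saturate a 30000-entry table
```

In this model the attack does have to be continuous, as Fred says, but the table occupancy is simply three times the packet rate, so the required rate depends entirely on how large the neighbor table (or the vendor's limit on incomplete entries) is.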
Firewalls should be able to protect you if they allow access only to specific IPv6 addresses. If you use something along the lines of "permit tcp any any eq 80" you're toast.
Hi Ivan,
One small doubt. Currently, we have a device with a /64 IPv6 prefix, and it faces neighbor cache exhaustion when we generate a TCP/IPv6 SYN attack from one of the devices (using the netwox simulation tool to send SYN packets from random sources).
If we assign a /112 or /120 IPv6 prefix, we are able to resolve this cache exhaustion, and legitimate users can access our device properly.
So, is there any other way to resolve this neighbor cache exhaustion, apart from reducing the subnet size from /64 to /112 or /120 and adding a router in front of our device to accept specific source subnets and restrict the rest?
Thanks Kumar
The easiest way to solve that challenge would be with ingress access lists on the switch/router.