The never-ending story of IP fragmentation

In the last few months I've run across a number of IP fragmentation issues, as you've probably noticed from my blog posts. I've also encountered a lot of misconceptions about IP fragmentation, its impact on GRE and IPSec, and the fragmentation-related mechanisms such as Path MTU Discovery (PMTUD). I hope that you'll find my January IP Corner article, The Never-Ending Story of IP Fragmentation, a good summary of the subject.

11 comments:

  1. I am a bit puzzled about some statements in the article. You write:

    or you could enable PMTUD for GRE tunnels with the tunnel path-mtu-discovery interface configuration command. When you enable the PMTUD on a GRE tunnel, the GRE packets are sent with the DF bit set and the router responds to the incoming ICMP destination unreachable messages with the reduction of the tunnel MTU size.


    On the other hand, you write:

    DF bit is copied from the source IP packet into the GRE envelope. If the source IP packet doesn’t have the DF bit set, it won’t be set in the outgoing GRE packet, potentially resulting in fragmentation of the GRE packet and expensive reassembly on the tail-end router.


    How do these two statements connect? Which combination of tunnel path-mtu-discovery and a DF flag in an incoming packet causes DF to be set on the GRE packet?

  2. OK, let me try to rephrase:

    * If the tunnel path-mtu-discovery is not configured, all GRE packets are sent without the DF bit and thus fragmented if needed (and the receiving router falls back from CEF into process switching and dies a horrible death when the traffic load increases :( ).

    * If the tunnel path-mtu-discovery is configured, the DF bit is copied from the source IP packet into the GRE packet, triggering PMTUD if and only if the original packet looks like it could come from a PMTUD-aware source.
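The two cases above can be sketched as a small Python model. This is only an illustration of the behavior as described in this comment and in the quoted article text, not of how IOS implements it. The class and method names are hypothetical, and the 24-byte overhead assumes a 20-byte outer IPv4 header plus a 4-byte basic GRE header with no options.

```python
from dataclasses import dataclass

# Assumption: 20-byte outer IPv4 header + 4-byte basic GRE header (no key/sequence).
GRE_OVERHEAD = 24


@dataclass
class GreTunnel:
    """Hypothetical model of a GRE tunnel head-end, for illustration only."""
    pmtud_enabled: bool      # models the "tunnel path-mtu-discovery" setting
    path_mtu: int = 1500     # assumed MTU of the path underneath the tunnel

    def outer_df(self, inner_df: bool) -> bool:
        """DF bit of the outgoing GRE packet.

        Without PMTUD the GRE packet is always sent with DF clear;
        with PMTUD the DF bit is copied from the source IP packet.
        """
        return inner_df if self.pmtud_enabled else False

    def tunnel_mtu(self) -> int:
        """Largest source IP packet that fits without fragmenting the GRE packet."""
        return self.path_mtu - GRE_OVERHEAD

    def on_icmp_frag_needed(self, next_hop_mtu: int) -> None:
        """With PMTUD enabled, an ICMP unreachable lowers the tunnel MTU."""
        if self.pmtud_enabled and next_hop_mtu < self.path_mtu:
            self.path_mtu = next_hop_mtu


tunnel = GreTunnel(pmtud_enabled=True)
print(tunnel.outer_df(inner_df=True))    # DF copied from the source packet
print(tunnel.outer_df(inner_df=False))   # no DF, so the GRE packet may be fragmented
tunnel.on_icmp_frag_needed(1400)
print(tunnel.tunnel_mtu())               # tunnel MTU after the ICMP message
```

In this model a tunnel without PMTUD never sets DF on the GRE packet, which is the case that leads to fragmentation and reassembly on the tail-end router.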

  3. Hello,

    The content of the article is empty under Safari.

  4. OK, thanks, that explains it.

  5. I actually sent an email to what I thought was the web admin of the nil.com site, saying that there are many formatting issues in Safari, such as the top navigation bar.

  6. @danshtr & asyncra: I'll let the relevant people know we have Safari issues. Is this the only article where you have problems, or do you have problems with other IP Corner articles as well?

  7. Hi Ivan

    I tried several articles, and none of them is readable.

  8. I've installed Safari 3 Beta for Windows XP (no Macs around here) to test the issue.

    The banner is definitely misplaced; it will be "fun" figuring out why, as it appears correctly in IE and FF. The blank article seems to be a timing issue (the JavaScript library is not loaded the first time you open the article) and will be fixed.

    As a workaround, I've got it to work by going BACK and FORWARD; the script was already cached on the second visit to the same page.

  9. Whilst the general consensus seems to be that blocking ICMP, particularly unreachables, is braindead and/or the product of stupidity (see the opening paragraph of the article), it is worth bearing in mind that Cisco itself is somewhat to blame for the unreliability of PMTUD by making it a tradeoff against speed on some of its platforms. See, for example, "Catalyst 6500/6000 Switch High CPU Utilization" (http://www.cisco.com/en/US/products/hw/switches/ps708/products_tech_note09186a00804916e0.shtml#unreach), which gives this as the first "performance solution":

    "The drop of denied packets and generation of ICMP-unreachable messages imposes a load on the MSFC CPU. In order to eliminate the load, you can issue the no ip unreachables interface configuration command. This command disables ICMP-unreachable messages, which allows the drop in hardware of all access group-denied packets."

    I'm sure there are many environments trying to squeeze performance out of their expensive 6500s that look at that and see it as an easy decision. If Cisco puts that kind of thing in the slowpath, it is no wonder that the rest of the world says "who cares if PMTUD breaks". Interestingly, that document makes no mention of PMTUD, or of breaking it, at all!

  10. I agree with you that telling people to disable ICMP unreachables without an attached warning is "a bit" short-sighted.

    However, in most 6500 deployment scenarios it wouldn't hurt you, as the MTU is the same on all interfaces (unless you're terminating IPSec sessions or GRE tunnels on it).

    It's obvious why the ICMP unreachables are in slowpath - while it's possible to do everything in ASICs (after all, the CPU is just a large ASIC ;), building whole new IP packets out of existing payload in silicon would be "slightly" expensive ... and then everyone would complain how overpriced these boxes are :)

  11. Vladimir Kocjancic, 14 January, 2008 13:57

    Issues with Safari browser are now fixed.



Ivan Pepelnjak, CCIE#1354, is the chief technology advisor for NIL Data Communications. He has been designing and implementing large-scale data communications networks as well as teaching and writing books about advanced technologies since 1990. See his full profile, contact him or follow @ioshints on Twitter.