Path MTU Discovery Doesn’t Work with IP Multicast
A friend of mine sent me an interesting problem:
I noticed recently that my IOS routers aren't sending ICMP (unreachable; frag needed) messages in response to too-big IPv4 multicast packets with DF-bit set. They're just dropping these packets silently, breaking PMTUD.
Unfortunately, that’s not a bug but a FAD (Functions-as-Designed).
A quick Google search found this document which pointed me to section 7.2 of RFC 1112 (yeah, multicast is really THAT old):
An ICMP error message (Destination Unreachable, Time Exceeded, Parameter Problem, Source Quench, or Redirect) is never generated in response to a datagram destined to an IP host group.
The same document also explains why RFC 1112 prohibits sending ICMP error messages in response to multicast datagrams: the way the *nix socket API processes ICMP error replies might block the sender's socket if an error comes back from a single receiver, or if the TTL expires on a particularly long branch of the multicast tree. That's not exactly a good idea in a multicast environment.
Lessons learned:
- You should never get ICMP error messages in response to IP multicast packets;
- Path MTU discovery doesn’t work with IP multicast;
- Sending multicast packets with DF bit set is a bad idea unless you’re OK with some receivers never getting them;
- ICMP echo reply to a multicast echo request is perfectly legal (because it’s not an ICMP error message).
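To make the lessons above concrete, here is a minimal Python sketch of the decision an IPv4 router makes for a packet that may not fit the outgoing MTU, plus one way a sender could avoid the problem. The function names (`router_action`, `clear_df`) are made up for illustration. The socket option values are the Linux ones and are an assumption on any other platform. This is a model of the behavior described in this post, not the actual IOS forwarding code.

```python
import ipaddress
import socket

# Linux values from <linux/in.h>. They are hard-coded as a fallback because
# not every Python build exports these names (assumption: Linux sender).
IP_MTU_DISCOVER = getattr(socket, "IP_MTU_DISCOVER", 10)
IP_PMTUDISC_DONT = getattr(socket, "IP_PMTUDISC_DONT", 0)


def router_action(dst: str, size: int, mtu: int, df: bool) -> str:
    """Model what an IPv4 router does with a packet of `size` bytes."""
    if size <= mtu:
        return "forward"
    if not df:
        return "fragment"
    # DF is set and the packet is too big: it has to be dropped.
    if ipaddress.ip_address(dst).is_multicast:
        # RFC 1112: never generate an ICMP error for a host-group destination.
        return "drop silently"
    return "drop and send ICMP fragmentation needed"


def clear_df(sock: socket.socket) -> None:
    """Linux only: tell the kernel not to set DF on packets from this socket."""
    sock.setsockopt(socket.IPPROTO_IP, IP_MTU_DISCOVER, IP_PMTUDISC_DONT)
```

Calling `router_action("239.1.1.1", 1500, 1400, True)` returns `"drop silently"`, which is the behavior my friend observed. The same packet sent to a unicast address would trigger the ICMP message that PMTUD relies on. A multicast sender on Linux could call `clear_df()` on its UDP socket so that oversized packets get fragmented instead of dropped.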
My multicast applications are mostly low rate and small packet size. Do you have any idea of average packet size for multicast applications such as IPTV? Or what kind of applications would this be a problem in?
- A badly configured intermediate device might fall back to process switching because of fragmentation and reassembly, and we'd like to avoid high CPU usage or added latency on real-time flows.
- Any MTU configuration mistake is easily discovered (the TV flow is not received).
The protocol implements simple mechanisms to handle duplicate data, out-of-order and lost segments.
Personally, I think that v4 got it wrong, and v6 got it right. PMTUD for multicast should be possible.
The concerns about too many unreachables (including when elicited by packets from spoofed sources) don't really resonate with me compared to the ugliness of needlessly fragmenting traffic. At the very least, this behavior should be configurable so that it can be used where appropriate.