A reader left the following comment on my “Does Multipath TCP Matter” blog post: “Why would I use MP-TCP in a data center? Couldn’t you use packet spraying at each hop and take care of re-ordering at the destination?”
Short answer: You could, but you might not want to.
Packet reordering can cause several interesting problems:
- There are (~~badly written~~ highly optimized) applications running over UDP that cannot tolerate packet reordering because they don’t buffer packets (example: VoIP);
- Out-of-order packets reduce TCP performance;
- Out-of-order packets kill receive segment coalescing.
Impact on TCP performance
According to this paper, packet reordering causes (at least) these performance problems:
- The TCP receiver sends a duplicate ACK for every out-of-order segment (which can trigger the fast retransmit algorithm on the sender), wasting CPU cycles and bandwidth;
- The TCP sender reduces its congestion window after receiving duplicate ACKs (assuming packets were lost in transit), effectively reducing TCP throughput;
- The TCP receiver has to buffer and reorder TCP segments within the TCP stack, wasting buffer memory and CPU cycles.
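The duplicate-ACK effect is easy to see in a toy model. The sketch below is a deliberately simplified illustration, not a real TCP stack: segments are numbered 0, 1, 2, ... instead of using byte sequence numbers, there is no SACK, and the function names are made up for this example. The duplicate-ACK threshold of three is the usual fast retransmit trigger.

```python
def receiver_acks(arrival_order):
    """Return the cumulative ACK sent after each arriving segment.

    An ACK value of n means "I have everything below n, send me n next".
    """
    buffered = set()   # out-of-order segments held for reassembly
    next_expected = 0
    acks = []
    for seg in arrival_order:
        if seg >= next_expected:
            buffered.add(seg)
        # Deliver any contiguous run that is now complete.
        while next_expected in buffered:
            buffered.remove(next_expected)
            next_expected += 1
        acks.append(next_expected)
    return acks


def count_fast_retransmits(acks, dupack_threshold=3):
    """Count how often the sender sees dupack_threshold duplicate ACKs in a row."""
    triggers = 0
    dup_run = 0
    for prev, cur in zip(acks, acks[1:]):
        if cur == prev:
            dup_run += 1
            if dup_run == dupack_threshold:
                triggers += 1
        else:
            dup_run = 0
    return triggers


in_order = [0, 1, 2, 3, 4, 5]
reordered = [1, 2, 3, 4, 0, 5]   # segment 0 took a slower path, nothing was lost

print(receiver_acks(in_order))                      # [1, 2, 3, 4, 5, 6]
print(receiver_acks(reordered))                     # [0, 0, 0, 0, 5, 6]
print(count_fast_retransmits(receiver_acks(reordered)))  # 1
```

No packet was dropped in the reordered case, yet the sender sees enough duplicate ACKs to assume a loss, retransmit segment 0, and shrink its congestion window.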
More information on this topic would definitely be most welcome. Please share in the comments section. Thank you!
Impact on TCP offload
On top of the performance problems listed above, packet reordering interferes with TCP offload (in particular with the receive segment coalescing functionality).
Receive segment coalescing is not relevant to traditional data center workloads (with most of the traffic being sent from servers toward remote clients), but can significantly improve the performance of elephant flows sent toward the server (example: iSCSI or NFS traffic). I don’t think you want to play with that, do you?
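To illustrate why reordering hurts coalescing, here is a rough sketch of a coalescing engine for a single flow. It assumes the engine can only merge a segment that directly follows the previous one and flushes the current batch otherwise; the `coalesce` function and the `max_segments` limit are hypothetical, and real NICs and network stacks differ in the details.

```python
def coalesce(arrival_order, max_segments=4):
    """Group arriving segment numbers into the batches handed to the TCP stack."""
    batches = []
    current = []
    for seg in arrival_order:
        contiguous = bool(current) and seg == current[-1] + 1
        # Flush when the segment cannot be appended or the batch is full.
        if current and (not contiguous or len(current) == max_segments):
            batches.append(current)
            current = []
        current.append(seg)
    if current:
        batches.append(current)
    return batches


in_order = [0, 1, 2, 3, 4, 5, 6, 7]
reordered = [0, 2, 1, 3, 5, 4, 7, 6]   # neighbors swapped by per-packet spraying

print(len(coalesce(in_order)))    # 2 batches of four segments each
print(len(coalesce(reordered)))   # 8 batches, one segment each
```

With in-order delivery the stack processes two large buffers; with mild reordering it is back to handling every segment individually, which is exactly the per-packet overhead the offload was supposed to remove.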
There are several really good reasons why almost nobody does per-packet ECMP load sharing by default (Brocade is a notable exception; they solve the packet reordering challenges in their switch hardware).
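For comparison, the difference between the usual per-flow ECMP and per-packet spraying can be sketched as follows. This is an illustration under simple assumptions: CRC32 over the 5-tuple stands in for whatever vendor-specific hash a real switch uses, spraying is modeled as plain round-robin, and the addresses and ports are made-up example values.

```python
import itertools
import zlib


def per_flow_link(five_tuple, num_links):
    """Pick an uplink by hashing the flow's 5-tuple (same flow, same link)."""
    key = "|".join(map(str, five_tuple)).encode()
    return zlib.crc32(key) % num_links


def per_packet_links(num_packets, num_links):
    """Pick an uplink per packet in round-robin fashion (packet spraying)."""
    rr = itertools.cycle(range(num_links))
    return [next(rr) for _ in range(num_packets)]


# (source IP, destination IP, protocol, source port, destination port)
flow = ("10.0.0.1", "10.0.0.2", 6, 49152, 3260)

flow_links = [per_flow_link(flow, 4) for _ in range(8)]
sprayed_links = per_packet_links(8, 4)

print(len(set(flow_links)))   # 1: every packet of the flow uses the same link
print(sprayed_links)          # [0, 1, 2, 3, 0, 1, 2, 3]
```

Per-flow hashing keeps all packets of a session on one path, so they cannot overtake each other; spraying spreads them across paths with potentially different queue depths, which is where the reordering comes from.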