Nicolas Vermandé sent me a really interesting question: “I've been looking for answers to a simple question that even different people at Cisco don't seem to agree on: Is it a good idea to class IP traffic (iSCSI or NFS over TCP) in pause no-drop class? What is the impact of having both pauses and TCP sliding windows at the same time?”
Let’s rephrase the question using the terminology Fred Baker used in his Bufferbloat Masterclass: does it make sense to use lossless transport for elephant flows or is it better to drop packets and let TCP cope with packet loss?
Randomly dropping an occasional TCP packet of a mouse session is definitely not bad. If you have thousands of TCP sessions on the same link and drop a single packet of one or two sessions to slow them down, the overall throughput won’t be affected much, and if you randomly hit different sessions at different times, you’re pretty close to effective management of a mice aggregate.
Elephants are different because they are rare and important (see also Storage Networking is Different and Does Dedicated iSCSI Infrastructure Make Sense?). Dropping a single packet of an elephant iSCSI session could affect thousands of end-user sessions (because the overall disk throughput would go down), even more so if you’re using iSCSI to access VMware VMFS volumes (where a single iSCSI session carries the data of all VMs running on the vSphere host). Classifying iSCSI into a lossless traffic class thus makes a lot of sense.
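To put rough numbers on the mice-versus-elephants argument, here is a back-of-the-envelope sketch. It assumes equal-rate sessions sharing a link and a classic loss-based TCP stack that halves its congestion window after a single drop; the function name and the `backoff` parameter are made up for this illustration, and the model ignores the fact that other sessions would eventually pick up the freed bandwidth.

```python
def aggregate_loss_fraction(sessions: int, backoff: float = 0.5) -> float:
    """Fraction of aggregate throughput lost right after ONE session backs off.

    Assumptions (illustrative only):
    - `sessions` TCP flows share the link equally
    - the flow hit by a drop multiplies its rate by `backoff`
      (0.5 = classic window halving)
    """
    if sessions < 1:
        raise ValueError("need at least one session")
    # One flow carries 1/sessions of the traffic and loses (1 - backoff) of it.
    return (1.0 - backoff) / sessions


if __name__ == "__main__":
    for n in (1, 4, 1000):
        print(f"{n:>5} sessions: {aggregate_loss_fraction(n):.2%} of aggregate lost")
```

With a thousand mice, one drop costs the aggregate about 0.05%; with a single elephant, the same drop momentarily costs half of the throughput that everyone behind that session depends on.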
Going back to Fred Baker’s Bufferbloat presentation: he claims that delay-based TCP congestion control (which is what you get with PFC, because paused packets are delayed instead of dropped) is the most stable approach, assuming the host TCP stack has a reasonable implementation that responds to delays.
Comparing the results of QoS policing (= dropping) versus shaping (= delaying) on a small number of TCP sessions supports the same conclusions. Here are the graphs Jeremy Stretch made for the Policing Versus Shaping article:
Throughput of four TCP sessions (+aggregate) with policing (packet drops)
Throughput of four TCP sessions (+aggregate) with shaping (packet delays)
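The mechanical difference between the two behaviors can be sketched with a toy per-tick model. This is only an illustration under simplifying assumptions: the function names, the tick-based token bucket, and the traffic pattern are invented for the example, and the TCP reaction (retransmissions, window reduction) is not modeled at all.

```python
def police(arrivals, rate, burst):
    """Token-bucket policer: packets that find no token are dropped."""
    tokens = burst
    sent, dropped = [], 0
    for pkts in arrivals:                 # pkts = packets arriving in this tick
        tokens = min(burst, tokens + rate)
        ok = min(pkts, tokens)            # only as many as we have tokens for
        tokens -= ok
        dropped += pkts - ok              # the excess is lost
        sent.append(ok)
    return sent, dropped


def shape(arrivals, rate):
    """Shaper: excess packets wait in a queue and leave in later ticks."""
    queue, max_backlog = 0, 0
    sent = []
    for pkts in arrivals:
        queue += pkts
        out = min(queue, rate)            # drain at most `rate` per tick
        queue -= out
        max_backlog = max(max_backlog, queue)
        sent.append(out)
    return sent, queue, max_backlog


if __name__ == "__main__":
    bursts = [10, 0, 0, 10, 0, 0]         # two bursts of 10 packets
    print("policed:", police(bursts, rate=4, burst=4))
    print("shaped: ", shape(bursts, rate=4))
```

In this toy run the policer forwards 8 of the 20 packets and drops the other 12, while the shaper delivers all 20 at the cost of a backlog of up to 6 packets. The sender behind the policer has to recover from loss; the sender behind the shaper only sees added delay.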
Storage networks, iSCSI, FCoE and Data Center Bridging (including PFC) are described in the Data Center 3.0 for Networking Engineers webinar.