My “Was it bufferbloat?” blog post generated an unexpected number of responses, most of them focusing on a side note saying “it looks like there really are service providers out there that are clueless enough to reorder packets within a TCP session”. Let’s walk through them.
One of my readers opened another can of VMware vSwitch worms. He sent me this question:
If a VM were to set a COS value, would the vSwitch reset it to 0 as part of its process of building the dot1q header?
The nasty detail (as you probably know) is that the 802.1p CoS value resides in the 802.1q (VLAN) tag.
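To make that detail concrete, here is a minimal Python sketch of the 4-byte 802.1Q tag layout: a 16-bit TPID (0x8100) followed by a 16-bit TCI that packs the 3-bit priority (the 802.1p CoS value), a 1-bit DEI and the 12-bit VLAN ID. The function names are made up for illustration; this only shows the bit layout and says nothing about what a vSwitch actually does with the field.

```python
import struct

DOT1Q_TPID = 0x8100  # EtherType value that identifies an 802.1Q tag


def build_dot1q_tag(pcp: int, vid: int, dei: int = 0) -> bytes:
    """Pack priority (802.1p CoS), DEI and VLAN ID into a 4-byte 802.1Q tag."""
    if not (0 <= pcp <= 7 and dei in (0, 1) and 0 <= vid <= 0x0FFF):
        raise ValueError("pcp is 3 bits, dei is 1 bit, vid is 12 bits")
    tci = (pcp << 13) | (dei << 12) | vid
    return struct.pack("!HH", DOT1Q_TPID, tci)


def parse_dot1q_tag(tag: bytes) -> dict:
    """Split a 4-byte 802.1Q tag back into its fields."""
    if len(tag) != 4:
        raise ValueError("an 802.1Q tag is exactly 4 bytes")
    tpid, tci = struct.unpack("!HH", tag)
    return {
        "tpid": tpid,
        "pcp": tci >> 13,          # top 3 bits: 802.1p CoS
        "dei": (tci >> 12) & 0x1,  # drop eligible indicator
        "vid": tci & 0x0FFF,       # low 12 bits: VLAN ID
    }


fields = parse_dot1q_tag(build_dot1q_tag(pcp=5, vid=100))
print(fields["pcp"], fields["vid"])
```

The consequence is visible in the layout: a frame without a VLAN tag has no place to carry a CoS value at all.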
The Mice and Elephants is a traditional QoS fable – latency-sensitive real-time traffic (or request-response protocols like HTTP) stuck in the same queue behind megabytes of file transfer (or backup or iSCSI) traffic.
The solution is also well known – color the elephants pink (aka DSCP marking) and sort them into a different queue – until reality intervenes.
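As a rough illustration of the "color and sort" idea, the Python sketch below extracts the DSCP value from the IP ToS / Traffic Class byte (DSCP is the upper six bits, ECN the lower two) and picks a queue from it. The DSCP-to-queue mapping and the queue names are assumptions invented for this example, not a recommended policy.

```python
# Hypothetical DSCP-to-queue mapping, chosen only for illustration.
QUEUE_BY_DSCP = {
    46: "priority",  # EF (Expedited Forwarding), commonly used for voice
    10: "bulk",      # AF11, used here as the "pink elephant" marking
}


def dscp_from_tos(tos_byte: int) -> int:
    """Return the 6-bit DSCP value carried in the ToS / Traffic Class byte."""
    if not 0 <= tos_byte <= 0xFF:
        raise ValueError("ToS / Traffic Class is a single byte")
    return tos_byte >> 2


def tos_from_dscp(dscp: int, ecn: int = 0) -> int:
    """Build the ToS / Traffic Class byte from a DSCP value and ECN bits."""
    if not (0 <= dscp <= 63 and 0 <= ecn <= 3):
        raise ValueError("dscp is 6 bits, ecn is 2 bits")
    return (dscp << 2) | ecn


def select_queue(tos_byte: int) -> str:
    """Sort a packet into a queue based on its DSCP marking."""
    return QUEUE_BY_DSCP.get(dscp_from_tos(tos_byte), "default")


print(select_queue(tos_from_dscp(46)), select_queue(tos_from_dscp(10)), select_queue(0))
```

The sketch works only as long as somebody marks the elephants correctly in the first place, which is exactly where reality tends to intervene.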
One of my readers sent me an interesting question:
I have been reading at many places about "throwing more bandwidth at the problem." How far is this statement valid? Should the applications (servers) work with the assumption that there is infinite bandwidth provided at the fabric level?
Moore’s law works in our favor. It’s already cheaper (in some environments) to add bandwidth than to deploy QoS.
One of my readers wanted to deploy FCoE on UCS in combination with Nexus 1000v and wondered how the FCoE traffic impacts QoS on Nexus 1000v. He wrote:
Let's say I want 4Gb for FCoE. Should I add bandwidth shares up to 60% in the nexus 1000v CBWFQ config so that 40% are in the default-class as 1kv is not aware of FCoE traffic? Or add up to 100% with the assumption that the 1kv knows there is only 6Gb left for network? Also, will the Nexus 1000v be able to detect contention on the uplink even if it doesn't see the FCoE traffic?
As always, things aren’t as simple as they look.
A long while ago there was an interesting discussion started by Brad Hedlund (then at Dell Force10) comparing leaf-and-spine (Clos) fabrics built from fixed-configuration pizza box switches with high-end chassis switches. The comments made by other readers were all over the place (addressing pricing, wiring, power consumption) but surprisingly nobody addressed the queuing issues.
This blog post focuses on queuing mechanisms available within a switch; the next one will address end-to-end queuing issues in leaf-and-spine fabrics.
One of the usual complaints I hear whenever I mention overlay virtual networks is “with overlay networks we lose all application visibility and QoS functionality” ... that worked so phenomenally in the physical networks, right?
A while ago Tomasz Kacprzynski asked me whether I'd ever run RSVP over DMVPN. I hadn't - after all, you'd only need that in VoIP environments and I try to stay as far away from voice as possible.
In the meantime, Tomasz solved the problem (short summary: you have to turn Phase 3 DMVPN into Phase 2 DMVPN) and wrote a lengthy blog post describing the problem (RSVP split horizon rule) and his solution (including numerous debugging printouts). Definitely worth reading if there's a non-zero chance you'll have to get the two working together.
Nicolas Vermandé sent me a really interesting question: “I've been looking for answers to a simple question that even different people at Cisco don't seem to agree on: Is it a good idea to class IP traffic (iSCSI or NFS over TCP) in pause no-drop class? What is the impact of having both pauses and TCP sliding windows at the same time?”
OpenFlow is not exactly known for its quality-of-service features (hint: there are none), but as I described in the ProgrammableFlow Technical Deep Dive webinar, NEC implemented numerous OpenFlow extensions in their edge switches and the ProgrammableFlow controller to give you a robust set of QoS features.
During a recent ExpertExpress engagement I got an interesting question: “could we do per-customer policing and shaping on an MX-80 if we want to offer VPLS services and have Q-in-Q encapsulation on customer-facing links?” As I have precious little Junos/MX knowledge, it was time for the classic “I’ll get back to you” reply and some heavy research.
You probably know how hard it is to find in-depth information on an unknown platform running unfamiliar software. Fortunately, Doug Hanks (@douglashanksjr) sent me a review copy of his new Juniper MX Series book a while ago. It was time for some serious reading.
I knew Geoff Huston would have a great presentation, but his QoS presentation was even better than I expected. I don’t necessarily agree with everything he said, but every vendor peddling QoS should be forced to listen to his explanation of the underlying problems and kludgy solutions first.
Whenever the networking industry invents a new (somewhat radical) technology, bandwidth-on-demand seems to be one of the much-touted use cases. OpenFlow/SDN is no different – Juniper used its OpenFlow implementation (Open vSwitch sitting on top of Junos SDK) to demonstrate Bandwidth Calendaring (see Dave Ward’s presentation @ OpenFlow Symposium for more details), Greg Ferro was talking about the same topic in his fantastic Introduction to OpenFlow/SDN webinar, and Dmitri Kalintsev recently blogged “How about an ability for things like Open vSwitch ... to actually signal the transport network its connectivity requirements ... say desired bandwidth”. I have only one problem with these ideas: I’ve seen them before.
I got a really interesting question from one of my readers (slightly paraphrased):
Is this a correct statement: QoS on a WAN router will always be on if there are packets on the wire as the line is either 100% utilized or otherwise nothing is being transmitted. Comments like “QoS will kick in when there is congestion, but there is always congestion if the link is 100% utilized on a per moment basis” are confusing.
Well, QoS is more than just queuing. First you have to classify the packets; then you can perform any combination of marking, policing, shaping, queuing and dropping.
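To illustrate why some of these functions act even when nothing is waiting in a queue, here is a minimal Python sketch of a single-rate token-bucket policer. The class name, parameters and the explicit `now` timestamp are assumptions made for this example; real router implementations differ in many details (for instance two-rate policers and remarking instead of dropping).

```python
class TokenBucketPolicer:
    """Single-rate policer: a packet conforms if enough tokens are available."""

    def __init__(self, rate_bps: float, burst_bytes: int):
        self.rate_bytes_per_sec = rate_bps / 8.0
        self.burst_bytes = burst_bytes
        self.tokens = float(burst_bytes)  # the bucket starts full
        self.last_seen = 0.0

    def conforms(self, size_bytes: int, now: float) -> bool:
        """Refill the bucket for the elapsed time, then try to spend tokens."""
        elapsed = max(0.0, now - self.last_seen)
        self.tokens = min(self.burst_bytes,
                          self.tokens + elapsed * self.rate_bytes_per_sec)
        self.last_seen = now
        if size_bytes <= self.tokens:
            self.tokens -= size_bytes
            return True   # in profile: forward (or mark as conforming)
        return False      # out of profile: drop or remark


# 8 kbps = 1000 bytes per second, with a 1500-byte burst allowance
policer = TokenBucketPolicer(rate_bps=8000, burst_bytes=1500)
print(policer.conforms(1500, now=0.0))  # uses up the initial burst
print(policer.conforms(1500, now=0.0))  # nothing left in the bucket
print(policer.conforms(1000, now=1.0))  # one second refills 1000 bytes
```

The policer's decision depends only on the configured rate and the arrival times, not on whether the outgoing link is busy; queuing, by contrast, matters only when packets arrive faster than the interface can send them.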
2011-06-23: Added description of various link efficiency mechanisms.
Got this question a few days ago:
I have a large DMVPN network (~ 1000 sites) using variety of DSL, cable modem, and wireless connections. In all of these cases the bandwidth is extremely dissimilar and even varies with time. How can I handle this in a scalable way? Also, do you know of any product or facility that I can use to better measure the bandwidth from hub to spoke and better set the QOS values?
The last question is the easy part: one of the products that does that is the NIL Monitor service, where the remote probes can measure the actual end-to-end bandwidth. NIL Monitor software can also log into routers and change configurations if needed ... but what should you change?