Nicolas Vermandé sent me a really interesting question: “I've been looking for answers to a simple question that even different people at Cisco don't seem to agree on: Is it a good idea to class IP traffic (iSCSI or NFS over TCP) in pause no-drop class? What is the impact of having both pauses and TCP sliding windows at the same time?”
OpenFlow is not exactly known for its quality-of-service features (hint: there are none), but as I described in the ProgrammableFlow Technical Deep Dive webinar NEC implemented numerous OpenFlow extensions in their edge switches and the ProgrammableFlow controller to give you a robust set of QoS features.
During a recent ExpertExpress engagement I got an interesting question: “could we do per-customer policing and shaping on an MX-80 if we want to offer VPLS services and have Q-in-Q encapsulation on customer-facing links?” As I have precious little Junos/MX knowledge, it was time for the classic “I’ll get back to you” reply and some heavy research.
You probably know how hard it is to find in-depth information on an unknown platform running unfamiliar software. Fortunately, Doug Hanks (@douglashanksjr) sent me a review copy of his new Juniper MX Series book a while ago. It was time for some serious reading.
I knew Geoff Huston would have a great presentation, but his QoS presentation was even better than I expected. I don’t necessarily agree with everything he said, but every vendor peddling QoS should be forced to listen to his explanation of the underlying problems and kludgy solutions first.
Whenever the networking industry invents a new (somewhat radical) technology, bandwidth-on-demand seems to be one of the much-touted use cases. OpenFlow/SDN is no different – Juniper used its OpenFlow implementation (Open vSwitch sitting on top of Junos SDK) to demonstrate Bandwidth Calendaring (see Dave Ward’s presentation at the OpenFlow Symposium for more details), and Dmitri Kalintsev recently blogged “How about an ability for things like Open vSwitch ... to actually signal the transport network its connectivity requirements ... say desired bandwidth”. I have only one problem with these ideas: I’ve seen them before.
I got a really interesting question from one of my readers (slightly paraphrased):
Is this a correct statement: QoS on a WAN router will always be on if there are packets on the wire as the line is either 100% utilized or otherwise nothing is being transmitted. Comments like “QoS will kick in when there is congestion, but there is always congestion if the link is 100% utilized on a per moment basis” are confusing.
Well, QoS is more than just queuing. First you have to classify the packets; then you can perform any combination of marking, policing, shaping, queuing and dropping.
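To make the "classify first, act later" order concrete, here is a minimal Python sketch of a classifier followed by a single-rate token-bucket policer. The class names, the three-class policy, and the rate and burst values are assumptions made up for illustration; they are not taken from any particular router implementation.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    dscp: int   # 6-bit DSCP value from the IP header
    size: int   # packet length in bytes


def classify(pkt: Packet) -> str:
    """Step 1: map a packet to a traffic class (hypothetical policy)."""
    if pkt.dscp == 46:            # EF
        return "voice"
    if (pkt.dscp >> 3) >= 3:      # IP precedence 3 and above
        return "business"
    return "best-effort"


class TokenBucketPolicer:
    """Step 2 (one of several possible actions): single-rate policer."""

    def __init__(self, rate_bps: int, burst_bytes: int):
        self.rate = rate_bps / 8          # refill rate in bytes per second
        self.burst = burst_bytes          # bucket depth in bytes
        self.tokens = float(burst_bytes)  # bucket starts full
        self.last = 0.0                   # time of the previous refill

    def conforms(self, pkt: Packet, now: float) -> bool:
        # Refill the bucket for the elapsed time, capped at the burst size.
        elapsed = now - self.last
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if pkt.size <= self.tokens:
            self.tokens -= pkt.size
            return True                   # in contract: forward (or mark)
        return False                      # out of contract: drop (or remark)


policers = {"best-effort": TokenBucketPolicer(rate_bps=8000, burst_bytes=1500)}
pkt = Packet(dscp=0, size=1000)
traffic_class = classify(pkt)
first = policers[traffic_class].conforms(pkt, now=0.0)
second = policers[traffic_class].conforms(pkt, now=0.0)
```

The policer acts on every packet regardless of link load, which is one reason "QoS only kicks in during congestion" is too simple: only the queuing part depends on the state of the output queue.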
Got this question a few days ago:
I have a large DMVPN network (~ 1000 sites) using variety of DSL, cable modem, and wireless connections. In all of these cases the bandwidth is extremely dissimilar and even varies with time. How can I handle this in a scalable way?
Hub-to-spoke QoS implementations in DMVPN networks usually use one of the following options:
When I was testing QoS behavior in MPLS/VPN-over-DMVPN networks, I needed a traffic source that could generate packets with different DSCP/IP precedence values. If you have enough routers in your lab (and the MPLS/DMVPN lab that was used to generate the router configurations you get as part of the Enterprise MPLS/VPN Deployment and DMVPN: From Basics to Scalable Networks webinars has 8 routers), it’s usually easier to use a router as a traffic source than to connect an extra IP host to the lab network. Task-at-hand: generate traffic with different DSCP values from the router.
I got a great question in one of my Enterprise MPLS/VPN Deployment webinars when I was describing how you could run MPLS/VPN across a DMVPN cloud:
That sounds great, but how does end-to-end QoS work when you run IP-over-MPLS-over-GRE-over-IPSec-over-IP?
My initial off-the-cuff answer was:
Well, when the IP packet arriving through a VRF interface gets its MPLS label, the IP precedence bits from the IP packet are copied into the MPLS EXP (now TC) bits. As for what happens when the MPLS packet gets encapsulated in a GRE packet and when the GRE packet is encrypted… I have no clue. I need to test it.
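The copy step in that answer is easy to sketch in Python. This assumes the label imposition simply copies the three high-order DSCP bits (the IP precedence) into the 3-bit EXP/TC field; the function names are hypothetical, and what happens at the GRE and IPsec layers is deliberately left out.

```python
def dscp_to_precedence(dscp: int) -> int:
    """Return the IP precedence: the three high-order bits of the DSCP."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP is a 6-bit value (0..63)")
    return dscp >> 3


def default_exp(dscp: int) -> int:
    """EXP/TC value under the assumed 'copy the precedence bits' behavior."""
    return dscp_to_precedence(dscp)


# A few well-known code points: EF = 46, CS5 = 40, AF31 = 26, AF33 = 30
for name, dscp in [("EF", 46), ("CS5", 40), ("AF31", 26), ("AF33", 30)]:
    print(f"{name:5} DSCP {dscp:2} -> EXP {default_exp(dscp)}")
```

Squeezing six bits into three loses information: EF and CS5 end up with the same EXP value, and so do AF31 and AF33, so the drop-precedence part of the AF code points does not survive the copy.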
The last (and the least popular) Data Center Bridging (DCB) standard, 802.1Qau Congestion Notification, tries to solve the problem of congestion in large bridged domains (the other two standards have different goals: PFC enables lossless transport and ETS standardizes DWRR queuing). To illustrate the need for congestion control, consider a simple example shown in the following diagram:
Two weeks ago I wrote about the challenges you’ll encounter when trying to implement end-to-end QoS in an enterprise network that uses MPLS/VPN service as one of its transport components. Most of the issues you’ll encounter are caused by the position of the user-SP demarcation point. The Service Providers smartly “assume” the demarcation point is the PE-router interface… and everything up to that point (including their access network) is your problem.
A while ago John McManus wrote a great DSCP QoS Over MPLS Thoughts article on the Etherealmind blog explaining how the 6-bit IP DSCP value gets mapped into the 3-bit MPLS EXP field (now renamed to Traffic Class field). The most important lesson from his post should be “there is no direct DSCP-to-EXP mapping and you have to coordinate your ideas with the SP”. Let’s dig deeper into the SP architecture to truly understand the complexities of this topic.
We’ll start with a reference diagram: user traffic is flowing from Site-A to Site-B and the Service Provider is offering MPLS/VPN service between PE-A and PE-B. Traffic from multiple customer sites (including Site-A) is concentrated at SW-A and passed in individual VLANs to PE-A.
Data Center Ethernet (or DCB or CEE, depending on who you are) is a hot story these days and it’s no wonder that misconceptions abound. However, when I hear several CCIEs I highly respect say that “Priority Flow Control can be used to stop all the other traffic when storage needs more bandwidth”, I get worried. Exactly the opposite is true: you use PFC to stop the overzealous storage traffic (primarily FCoE, but also iSCSI) to make sure you don’t drop it.
Enhanced Transmission Selection (ETS) is the second part of the Data Center Bridging puzzle (I’ve already described Priority Flow Control). It specifies two different technologies:
- Queuing mechanisms in bridges
- Data Center Bridging eXchange protocol: a Control/Negotiation protocol that allows bridges and hosts to negotiate QoS parameters in a bridged network.
Although some bridges from some vendors supported numerous QoS mechanisms in the past, 802.1Qaz is the first attempt to standardize a richer set of QoS behaviors than the strict priority queuing defined in 802.1p.
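For a feel of what "richer than strict priority" means, here is a minimal Python sketch of a deficit weighted round robin (DWRR) scheduler, the kind of bandwidth-sharing mechanism usually associated with ETS. The queue names, packet sizes, and quantum values are assumptions chosen for illustration, not values from the standard.

```python
from collections import deque


def dwrr(queues, quanta, rounds):
    """Serve queues in round-robin order, giving each a byte quantum per round.

    queues: dict of class name -> deque of packet sizes in bytes
    quanta: dict of class name -> bytes of credit added per round
    Returns the transmit order as a list of (class name, packet size).
    """
    deficit = {name: 0 for name in queues}
    sent = []
    for _ in range(rounds):
        for name, queue in queues.items():
            if not queue:
                deficit[name] = 0          # idle queues do not hoard credit
                continue
            deficit[name] += quanta[name]
            # Send packets while the head of the queue fits into the credit.
            while queue and queue[0] <= deficit[name]:
                size = queue.popleft()
                deficit[name] -= size
                sent.append((name, size))
            if not queue:
                deficit[name] = 0
    return sent


queues = {"san": deque([1000] * 4), "lan": deque([1000] * 4)}
quanta = {"san": 2000, "lan": 1000}        # 2:1 bandwidth split
order = dwrr(queues, quanta, rounds=2)
san_bytes = sum(size for name, size in order if name == "san")
lan_bytes = sum(size for name, size in order if name == "lan")
```

With both queues backlogged, the SAN class gets twice the bytes of the LAN class in every round, yet the LAN class is never starved, which is exactly what strict priority queuing cannot guarantee.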
Yesterday I wrote that you don’t need DCB technologies to implement FCoE in your network. The FC-BB-5 standard is quite explicit (it also says that 802.1Qbb is the other option):
Lossless Ethernet may be implemented through the use of some Ethernet extensions. A possible Ethernet extension to implement Lossless Ethernet is the PAUSE mechanism defined in IEEE 802.3-2008.
The PAUSE mechanism (802.3x) gives you lossless behavior, but results in undesired side effects when you run LAN and SAN traffic across a converged Ethernet infrastructure.
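A bit of arithmetic shows how blunt the tool is. An 802.3x PAUSE frame carries a 16-bit pause time expressed in quanta of 512 bit times, and it stops everything on the link, not just one traffic class. The sketch below is a back-of-the-envelope calculation; the function names are made up for illustration.

```python
PAUSE_QUANTUM_BITS = 512      # one pause quantum = 512 bit times
MAX_PAUSE_QUANTA = 0xFFFF     # pause_time is a 16-bit field


def pause_duration(quanta: int, link_bps: float) -> float:
    """Seconds a link stays silent for a given pause_time value."""
    if not 0 <= quanta <= MAX_PAUSE_QUANTA:
        raise ValueError("pause_time must fit into 16 bits")
    return quanta * PAUSE_QUANTUM_BITS / link_bps


def bytes_held_back(quanta: int) -> int:
    """Bytes the sender could have transmitted during the pause."""
    return quanta * PAUSE_QUANTUM_BITS // 8


for speed in (1e9, 10e9):
    ms = pause_duration(MAX_PAUSE_QUANTA, speed) * 1000
    print(f"{speed / 1e9:.0f} Gbps: maximum pause {ms:.2f} ms")
```

A single maximum-length pause holds back roughly 4 MB of traffic regardless of link speed, and all of it (LAN and SAN alike) waits, which is where the undesired side effects on a converged link come from.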