CB-WFQ misconceptions « ipSpace.net blog

Wednesday, November 4, 2009 06:58 CET

CB-WFQ misconceptions

Reading various documents describing Class-Based Weighted Fair Queueing (CB-WFQ), one gets the impression that the following configuration …

class-map match-all High
 match access-group name High
!
policy-map WAN
 class High
  bandwidth percent 50
!
interface Serial0/1/0
 bandwidth 256
 service-policy output WAN
!
ip access-list extended High
 permit ip any host 10.0.3.1
 permit ip host 10.0.3.1 any

… allocates 128 kbps to the traffic to/from IP host 10.0.3.1 and distributes the remaining 128 kbps fairly between conversations in the default class.

I am overly familiar with weighted fair queuing (I was developing QoS training for Cisco when WFQ had just left the drawing board) and had thus always wondered how they managed to implement that behavior with WFQ structures. A comment made by Petr Lapukhov re-triggered my curiosity and prompted me to do some actual lab tests.

The answer is simple: CB-WFQ does not work as advertised.

To prove this claim, I started two parallel TTCP sessions: one to IP address 10.0.3.1, the other to IP address 10.0.3.2. This is what the show policy-map interface command displayed after a minute:

a1#show policy-map int ser 0/1/0
 Serial0/1/0

  Service-policy output: WAN

    Class-map: High (match-all)
      5996 packets, 3424386 bytes
      30 second offered rate 200000 bps, drop rate 0 bps
      Match: access-group name High
      Queueing
        Output Queue: Conversation 73
        Bandwidth 50 (%)
        Bandwidth 128 (kbps)Max Threshold 64 (packets)
        (pkts matched/bytes matched) 5981/3421234
        (depth/total drops/no-buffer drops) 4/0/0

    Class-map: class-default (match-any)
      516 packets, 270445 bytes
      30 second offered rate 6000 bps, drop rate 0 bps
      Match: any

The printout clearly demonstrates that the TCP session in the High class got way more than its allocated share (a 200 kbps offered rate against a 128 kbps allocation), while the TCP session in class-default got 6 kbps, more than 30 times less.
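
The arithmetic behind that observation can be sketched in a few lines of Python. The figures are copied from the configuration and the printout above; the variable names are made up for illustration, and the 30-second offered rates are averages, so treat the result as a rough indication rather than a precise measurement.

```python
# Figures taken from the configuration and the show policy-map printout.
# Variable names are illustrative only.
link_kbps = 256            # "bandwidth 256" on Serial0/1/0
high_percent = 50          # "bandwidth percent 50" in class High

# What the common reading of the documentation would predict.
expected_high_kbps = link_kbps * high_percent / 100
expected_default_kbps = link_kbps - expected_high_kbps

# What the printout reported (30 second offered rate, in bps).
observed_high_bps = 200_000
observed_default_bps = 6_000

total_bps = observed_high_bps + observed_default_bps
high_share = observed_high_bps / total_bps
ratio = observed_high_bps / observed_default_bps

print(f"expected: High {expected_high_kbps:.0f} kbps, default {expected_default_kbps:.0f} kbps")
print(f"observed: High {observed_high_bps / 1000:.0f} kbps, default {observed_default_bps / 1000:.0f} kbps")
print(f"High share of observed traffic: {high_share:.1%}, ratio {ratio:.1f}:1")
```

The expected split is 128/128 kbps, while the observed rates put the High class at roughly 97% of the traffic.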

Conclusion

The conversations in the class-default are treated as low-priority conversations and get significantly less bandwidth than other traffic classes.
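
To see how a weight-based scheduler can produce this kind of starvation, here is a minimal Python sketch of a fair-queuing scheduler that always sends the packet with the lowest finish number, where each queue's finish number advances by weight times packet size. The weights (1 and 30), the packet size, the packet count and the function name are hypothetical values chosen for illustration. They are not the weights IOS actually computes, and the sketch ignores TCP behavior entirely.

```python
import heapq


def wfq_bytes_sent(weights, packet_size=1000, packets_to_send=10000):
    """Simulate always-backlogged queues served in finish-number order.

    weights: dict mapping queue name to a (hypothetical) scheduling weight;
             a LOWER weight means the queue is served more often.
    Returns a dict mapping queue name to the number of bytes sent.
    """
    # Each heap entry is (finish number of the queue's head packet, queue name).
    heap = [(weight * packet_size, name) for name, weight in weights.items()]
    heapq.heapify(heap)
    sent = {name: 0 for name in weights}

    for _ in range(packets_to_send):
        # Send the packet with the lowest finish number.
        finish, name = heapq.heappop(heap)
        sent[name] += packet_size
        # The queue is always backlogged, so its next packet is scheduled
        # weight * packet_size later in virtual time.
        heapq.heappush(heap, (finish + weights[name] * packet_size, name))

    return sent


if __name__ == "__main__":
    # Assumed weights, NOT the values IOS derives from the configuration.
    result = wfq_bytes_sent({"High": 1, "default": 30})
    total = sum(result.values())
    for queue, sent_bytes in result.items():
        print(f"{queue}: {sent_bytes / total:.1%} of the bytes sent")
```

With these assumed weights the default queue ends up with roughly 1/31 of the bytes. The point of the sketch is only that a queue with a much larger weight is served far less often when all queues are backlogged.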

29 comments:

Brettski 04 November 2009 08:15

I think that the conclusion here is that the "bandwidth percent" statement doesn't limit the amount of bandwidth available for that class, it simply guarantees that the class will receive 50% during times of congestion. During normal operation it is highly possible that the class will receive more than 50% of the available bandwidth. Certainly that behavior is backed up by experience. If you want to limit the class to 50%, then policing or shaping is the way to go.

ET 04 November 2009 09:52

Would a second class with "percent 50" and a match on any traffic get you equal balancing of traffic between 10.0.3.1 and 10.0.3.2?

class Others
bandwidth percent 50
!
policy-map WAN
class High
bandwidth percent 50
class Others
bandwidth percent 50

ip access-list extended Others
permit ip any any

Regards,
Erik

pav 04 November 2009 10:03

There is no congestion; that is why you don't see any limitations. QoS simply does not kick in at the moment. The queue is empty, so the line can send traffic at the offered rate.

John Kougoulos 04 November 2009 10:47

The main problem with CBWFQ was that you couldn't have fair-queue inside a class with bandwidth statement and you couldn't have bandwidth in class-default. I think these are resolved somewhere in 12.4(20)T with HQF but I haven't tested it.

I think that you can do the following now:
policy mypolicy
class Aclass
bandwidth percent 50
fair-queue
class class-default
bandwidth percent 50
fair-queue

Shawn 04 November 2009 12:38

Where's your drops!?

Ivan Pepelnjak 04 November 2009 13:42

Absolutely agree with you. My point was that a class with 'bandwidth' action can get WAY MORE than you'd expect UNDER CONGESTION and starve the default class.

Ivan Pepelnjak 04 November 2009 13:43

You're absolutely correct. The drawback is that you have FIFO queuing in the Others class (unless you use HQF).

Ivan Pepelnjak 04 November 2009 13:44

WRONG. If you look at the printout, you'll find that the depth of the HIGH queue is 4, so the congestion is there.

Ivan Pepelnjak 04 November 2009 13:45

You're absolutely correct. I was writing about HQF a few days ago and a comment Petr made on that post triggered this investigation.

Ivan Pepelnjak 04 November 2009 13:45

If you'd get drops on a WAN link running two parallel TCP sessions, something would be awfully wrong. You don't need drops to have a congested line.

metoo 04 November 2009 14:06

Did you take into account that by default you can only reserve 75% of the max bandwidth?

Ivan Pepelnjak 04 November 2009 14:11

Sure. That's why I've reserved 50% ... but it got almost 100% :-E

shivlu jain 04 November 2009 14:23

Amazed to see that the bandwidth used is more than what was allocated. I think Cisco is trying to keep us in the dark and will come up with a new solution after showing the limitations of the previous one.
But the results and comments helped me for my exam.

Thanks Ivan

Will 04 November 2009 17:01

Shouldn't you be using a nested service-policy and shape the interface down to 256K (Assuming that you have a 256K CIR) and then allocate your 50 percent from that?

Swap 04 November 2009 17:15

This restriction (unequal distribution for classes without the "bandwidth" keyword) has been fixed in IOS 12.4(22)T and later, which support HQF. On those releases, regardless of configuration, class class-default in HQF images will always have an implicit bandwidth reservation equal to the unused interface bandwidth not consumed by user-defined classes.

So from 12.4(22)T onwards, CBWFQ has a kind of inbuilt policer (somewhat like LLQ).

12.4(22)T and later support the Hierarchical Queueing Framework (HQF) feature. In HQF images, flow-based fair queues, configurable with fair-queue in both user-defined classes and class-default, are scheduled equally (instead of by weight). By default, class-default receives a minimum of 1% of the interface or parent shape bandwidth. It is also possible to explicitly configure the bandwidth command in class-default.

few more details/results given on my blog -

http://eminent-ccie.blogspot.com/2009/09/qos-congestion-management-demystified.html

http://eminent-ccie.blogspot.com/2009/11/cbwfq-and-llq-revisited.html

cheers
Swap

Ivan Pepelnjak 04 November 2009 17:35

No, the interface was a point-to-point 256 kbps link (yeah, I know, the "clock rate" command was configured on the other side).

Ivan Pepelnjak 04 November 2009 17:37

Just a bit of a warning: there's a huge difference between "implicit bandwidth reservation" and "inbuilt policer".

Swap 04 November 2009 20:39

Sorry Ivan, I couldn't follow you.

My explanation was:
On HQF images, i.e. 12.4(22)T and onwards, classes with the "bandwidth" keyword have an inbuilt policer applied

and classes without "bandwidth" have an implicit bandwidth reservation.

Yes, they are totally different; they are polar opposites and apply to different classes.

Anything missed? Would you like to elaborate?

Thanks.

P.S: i'm lost in my empirical labs based on your MPLS architecture stuff for SP lab :)

Mike 04 November 2009 21:05

Does the fact that reserving any percentage gives it higher priority so the WAN queue is serviced more often have an effect causing the disparity?

Mike

Amit 05 November 2009 00:13

The demonstrated result is the correct behaviour. The "bandwidth" or "bandwidth percent" commands allocate the configured bandwidth for a class/queue. In case of more traffic or congestion, these classes will starve the class-default class, essentially making it a "sacrificial goat". This is CB-WFQ.

On the other hand, "priority" command will absolutely cut-off at the configured bandwidth but that makes it LLQ. 8-)

Brettski 05 November 2009 02:21

Yah. I dealt with that at one site by using a "police cir percent <bw> pir percent <bw>" style policy over several classes. That way, classes can exceed their minimum guaranteed bandwidth but can't consume the entire link.

C23 06 November 2009 00:12

I agree with you that class High can theoretically starve the default class...

...However, if you ran 2 simultaneous TCP sessions during your test, after a few "moments", shouldn't the behavior of TCP (slow start, round trip time measurement, congestion window, and all the stuff like that... I'm not a TCP expert ;) ) balance the throughput of both TCP sessions equally?

Dan 06 November 2009 14:48

I would have tried this with UDP mode of TTCP

Bryan 06 November 2009 17:37

My understanding is that under congestion any remaining bandwidth after queuing allocations gets assigned proportionately. So of the remaining 128K, High would get 50%, and class-default would get an implied 50%. You can influence this with the command bandwidth remaining.

Petr Lapukhov 07 November 2009 11:50

Ivan,

Can't go past any QoS discussion :) Here is the summary of the "research" I did back in the day on CBWFQ (classic)

http://blog.internetworkexpert.com/2008/08/17/insights-on-cbwfq/

Back then I tried to do some "reverse engineering" on CBWFQ, to show how exactly it adapts the classic WFQ weighting formula. Right now I'm trying to come up with a similar writeup on HQF, though it is significantly tougher - they hide all "implementation-dependent" details now.

As can be seen, CBWFQ really treats class-default "poorly", even with fair-queue enabled. Cisco never clarified that misconception, and I personally had some real-life issues with it :)

Steve 10 December 2009 16:57

Your original test yielded exactly the results one might expect on an ISR or 7200 running code prior to 12.4.20T. The bottom line is that a class-based queue is not guaranteed any significant bandwidth unless there is a bandwidth statement applied. You chose to forgo the minimum bandwidth guarantee to do WFQ in the default class, thus the default class was essentially serviced in a best-effort manner, and the 10.0.3.2 flow received less bandwidth. The lesson learned here is to be explicit in your configurations and not to rely on defaults if you don't know the default behavior for a particular device.

Now as mentioned earlier, Cisco made a code change to the ISR/7200 QoS code starting in 12.4.20T and finishing in 12.4.22T. This code change eliminated the WFQ option in the class-default and now allows FQ to be enabled in any non-priority class in conjunction with the bandwidth command. This change makes the queuing behavior more like the 7500, FlexWAN and SIP200. :-D

Ivan Pepelnjak 10 December 2009 17:19

Steve,

I agree with your "lesson learned", it's just another way of saying what I did. I was simply pointing out that the "common wisdom" on how CB-WFQ works (and what the default behavior is) is plain wrong, so I don't understand precisely why it seems you're somewhat cross with me.

As for HQF: if you browsed through the list of "related posts" (just below the article text), you'd find links to the HQF tests I did (and if you had read them, you'd have discovered I adore HQF).

Marc C 08 February 2011 06:53

Hi Ivan,

About the depth, which is 4: where in the output is the reference that made you say the link is congested?

Thanks a lot

Ivan Pepelnjak 08 February 2011 09:49

Last line of the "High queue" printout, the 'depth' value. It indicates there are 4 packets in the high queue waiting to be transmitted, so the congestion is obviously there.
