Interesting links (2011-11-06)

The “discovery of the week” award goes to Terry Slattery for pointing out the dangers of bufferbloat while investigating TCP retransmissions (part 1 and part 2). BTW, in the end, he figured out it was just an overloaded Gigabit Ethernet linecard.

Two other interesting discoveries: PA /48 IPv6 prefixes are still filtered and BGP is more stable than we thought it would be.

Another treasure trove of great posts is the Networking Tech Field Day web site – just go to the Post-Event Blogs section and read them all; they're all worth it. Most amusing one: The Comic Edition. Another must-read: The Problem.

In the “My FUD is better than yours” department we have an MPLS-bashing white paper from Fujitsu (I decided dissecting it and writing a response is a waste of time and would only increase cosmic entropy) and BYOD Isn’t As Scary As You Think from Juniper.

Cloudy nets

James Urquhart writes about new network models (part 1, part 2, part 3). If you’re even remotely interested in cloud networking, OpenStack, Quantum, Nova and/or OpenFlow, these posts are a must-read.

The Dark Side of Clouds by Chuck Hollis is an excellent analysis of “we can build it better” syndrome. I’ve seen the same behavior too many times to find it amusing, but you obviously can’t change some people.

Data Centers

How proprietary is proprietary and when does it matter? Tom Hollingsworth and Tony Bourke offer two interesting perspectives.

Brass Tacks blog has another great review of supported FCoE connectivity options.


Jan Zorz wrote about an interesting USB modem he received from Nokia (hint: IPv6-enabled 3G modem). He still hasn’t realized that there are a few more English- than Slovenian-speaking people interested in IPv6, and thus his thoughts got a bit scrambled by Google Translate.


  1. For bufferbloat, I wish there was a better explanation of the behavior. TCP looks to fill the "pipe", not to maximize the bandwidth, and by adding more buffers we just increase the pipe depth.

    TCP is a "clocked" protocol, so in general the sender window opens upon reception of incoming ACKs. If the data segments that have been sent out are delayed due to buffering, so will be the returning ACKs, effectively slowing down the CWND expansion and pipe filling. The only place you can go wrong is maybe slow start synchronization, where multiple senders overfill the pipe due to exponential growth.

    Anyways, I'll look around, as so far I haven't found a clear model of this behavior. Hey, we just tuned buffers up to reduce the impact of TCP incasting and now they tell us to shrink them back!
  2. Hi Petr,

    I'm sure you've seen the graph I posted in my blog a few years ago, which demonstrates the issue quite ... ugh ... graphically.

    I made sure it was a single TCP stream over this pipe, so what you see wasn't due to SS sync.

    As you can see from the graph, the TCP stream experienced extremely wild variations of RTT and sending rate. What was happening is that insane overbuffering prevented TCP from discovering an equilibrium sending rate: it allowed CWND to grow too high, get the huge buffer filled, and then go back to retransmission and shrinking CWND almost to zero. Basically it's a classic TCP sawtooth with an extremely large tooth, making the average rate very poor.

    As to Terry Slattery's explanation regarding unnecessary retransmissions, I'm sure there were some (had to be with RTT well up into tens of seconds!). But I'm not sure there were enough of them to clog the pipe and exacerbate the issue.

    I still have the packet captures lying around and can share them if you're interested.
  3. The correct url for the graph is
  4. Oh man, it was silly of me to forget that TCPoDOCSIS blog post of yours :) Anyways, I feel I kind of get it :) TCP expands CWND way too much, hugely overestimating the pipe "width" (=bandwidth), and then collapses dramatically when it hits the ceiling of the buffer depth.

    Thanks, and please send me the packet caps once you get a moment to petrlapu at microsoft dot com!
  5. Fujitsu can't be serious:

    The Ethernet-based connection-oriented Ethernet technologies—VLAN switching and PBB-TE—uniquely allow service providers to enjoy the deterministic performance, efficient aggregation and 50 ms protection

    What are those guys smoking?
  6. ...and some of my 2c on the bufferbloating: imagine that you have a VPLS service, where RTT between the sites is quite different, and one major (say, DC) site has to communicate with all the others. The problem with this situation, of course, is that there will be no "correct" output buffer size. And if I understand it correctly, the worst punished will be the closest sites (smallest RTT).

    Hmm.... Sure smells like an opportunity! :)
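The overbuffering behavior discussed in the comments above can be illustrated with a toy model. What follows is a minimal sketch, not a faithful TCP implementation: it assumes a single flow, one bottleneck link, one additive-increase step per round trip, and a simple halving of CWND on buffer overflow (no slow start, no retransmission timeouts). The function name `simulate_aimd` and all of its parameters are made up for this illustration. It only shows the latency side of the story: the peak RTT approaches the base RTT plus `buffer_pkts / rate_pps`, so the same buffer inflates the RTT of a nearby site far more, in relative terms, than that of a distant one.

```python
def simulate_aimd(rate_pps, base_rtt_s, buffer_pkts, rounds=50_000):
    """Toy single-flow AIMD model; one loop iteration is one round trip.

    rate_pps    -- bottleneck rate in packets per second (assumed constant)
    base_rtt_s  -- round-trip time with an empty queue, in seconds
    buffer_pkts -- size of the bottleneck buffer in packets
    Returns (peak_rtt_s, mean_rtt_s) over the rounds without loss.
    """
    bdp_pkts = rate_pps * base_rtt_s      # packets that fit "in the pipe"
    cwnd = 1.0
    rtts = []
    for _ in range(rounds):
        # Whatever does not fit in the pipe sits in the bottleneck buffer.
        queued = max(0.0, cwnd - bdp_pkts)
        if queued > buffer_pkts:
            # Buffer overflow: model the loss as a multiplicative decrease.
            cwnd = max(1.0, cwnd / 2.0)
            continue
        # Queueing delay is the time needed to drain the queued packets.
        rtts.append(base_rtt_s + queued / rate_pps)
        cwnd += 1.0                       # additive increase per round trip
    return max(rtts), sum(rtts) / len(rtts)


if __name__ == "__main__":
    rate = 1000.0                         # hypothetical bottleneck, packets/s
    for base_rtt in (0.01, 0.1):          # a "near" and a "far" site
        for buf in (100, 10_000):         # modest vs. bloated buffer
            peak, mean = simulate_aimd(rate, base_rtt, buf)
            print(f"base RTT {base_rtt:5.2f}s buffer {buf:6d} pkts: "
                  f"peak RTT {peak:6.2f}s, mean RTT {mean:6.2f}s")
```

With the bloated buffer the modeled RTT climbs into the range of seconds before the first loss is ever signaled, which is the delayed feedback the comments describe. The throughput collapse seen in the real capture is not reproduced here, because the sketch never lets CWND fall all the way back toward zero.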