I made a flippant remark in a blog comment…
While it’s academically stimulating to think about forwarding small packets (and applicable to large-scale VoIP networks), most environments don’t have to deal with those. It looks like it’s such a non-issue that I couldn’t find recent data; in the good old days ~50% of the packets were 1500 bytes long.
… and Minh Ha (by now a regular contributor to my blog) quickly set me straight with a lengthy comment that’s too good to be hidden somewhere at the bottom of a page. Here it is (slightly edited). Also, you might want to read other comments to the original blog post for context.
I don’t deny that small packets are not much of an issue on a daily basis, but there are several considerations that put the topic beyond academia. The first one is quite obvious: to benchmark how well one router/switch performs versus the next, a common ground is needed, and since routers are traditionally evaluated on their worst-case performance, small packets are useful for benchmarking.
Second, small-packet performance shows the basic capability of a router (let’s call them all routers, as most of them do layer 3) to perform a plain destination-based lookup. This basic capability deteriorates quickly as we start turning on sophisticated features like security or QoS. The Cisco QuantumFlow processor, when it came out, could handle about 8 Mpps, degrading to 2 Mpps as more features were activated. So it’s always good to know at least the baseline performance, so we can have a rough expectation of how well a box will perform under stress.
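To put those baseline numbers in perspective, here’s a quick back-of-the-envelope calculation (my addition, not part of the original comment) of the line-rate packet load on a 10GE port at various frame sizes. The only assumption is standard Ethernet wire overhead: an 8-byte preamble and a 12-byte inter-frame gap per frame.

```python
# Line-rate packets-per-second on a 10 Gbps Ethernet port.
LINK_BPS = 10_000_000_000
WIRE_OVERHEAD = 20  # bytes per frame: 8-byte preamble + 12-byte inter-frame gap

def line_rate_pps(frame_bytes: int) -> int:
    """Maximum frames per second at full line rate for a given frame size."""
    return LINK_BPS // ((frame_bytes + WIRE_OVERHEAD) * 8)

for frame in (64, 512, 1518):  # minimum, mid-size, and maximum standard frames
    print(f"{frame:>5}-byte frames: {line_rate_pps(frame) / 1e6:.2f} Mpps")
# → 14.88 Mpps, 2.35 Mpps, 0.81 Mpps
```

The ratio between the two extremes is roughly 18:1 – a lookup engine that comfortably keeps up with full-size frames has to work an order of magnitude harder at minimum frame size, which is exactly why small packets are the worst-case benchmark.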
I believe back in the early 2000s one of Juniper’s platforms was known for ingress-to-egress-port out-of-order delivery of packets under heavy traffic conditions. All routers with a buffered-crossbar architecture are susceptible to this of course, but the situation arises when there’s a massive backlog of cells in the VOQs and the fabric, which happens when packet lookup/processing performance is less than optimal. So it’s always good to have a superior architecture that can handle packets smoothly, and that quality is reflected in how well a platform handles small packets.
Third, real-life traffic is not uniform, but tends to be long range dependent (LRD)/self-similar. Even Broadcom admits as much on page 5 of this presentation.
The self-similarity/LRD can be due to heavy hitters (the elephant-and-mice flow distribution), the nature of file-size distributions, etc. TCP congestion control also contributes to it. With this kind of traffic, congestion spreading is a reality, and routers with weak architectures will be killed. So high-performance routing platforms are always good to have. Self-similar traffic also makes big buffers mostly useless, but I digress ;).
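The elephant-and-mice effect is easy to see with a minimal simulation (again my sketch, not Minh’s). The distribution and its parameters are illustrative assumptions: flow sizes are drawn from a heavy-tailed Pareto distribution, a common model for this kind of skew, and we check what share of the bytes the biggest 10% of flows carry.

```python
import random

random.seed(42)  # deterministic, for reproducibility

# Illustrative assumption: flow sizes follow a heavy-tailed Pareto
# distribution with shape alpha = 1.2 (finite mean, very long tail).
flows = [random.paretovariate(1.2) for _ in range(10_000)]

flows.sort(reverse=True)
elephants = flows[: len(flows) // 10]   # the biggest 10% of flows
share = sum(elephants) / sum(flows)     # fraction of total bytes they carry

print(f"Top 10% of flows carry {share:.0%} of the bytes")
```

With these parameters the top tenth of the flows typically ends up carrying well over half of the traffic – a handful of elephants dominating a crowd of mice, which is the traffic shape a weak forwarding architecture struggles with.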
I’m aware that some time ago there was a back-and-forth argument about whether small-packet performance is needed – someone was nit-picking on Arista if I recall – and Arista’s response was something along the lines of “we produce switches to meet our customers’ use cases, and so far none of them demands superior small-packet handling, so it’s basically a non-issue”. While on the surface this seems reasonable, looking deeper it’s not, for all the above reasons. After all, if your router is so good, why wouldn’t it outperform competitors on the most basic of all benchmarks, packet forwarding? If yours is indeed supreme, then it doesn’t matter what the benchmark is, right? You’d still outperform. If it performs well with small packets, then I can sleep well after buying it, knowing I can trust its tenacity even in extreme or hostile conditions.
And speaking of hostile conditions, DDoS is another reason small-packet performance matters, as it translates directly into how much a router can take before it goes down. Of course there are other measures to protect against DDoS, but if all vendors provide similar capabilities in those areas, then what sets one apart from another is packet-handling capacity, as that will decide who’s the last man standing in a heavy DDoS attack.
Some time ago I came across this paper from IBM (released in 2013) that surveyed over 30,000 servers, located across more than 50 production data centers, over a two-year time span. Among other findings, 80%+ of packets are 500 bytes or less, and on top of that, more small packets come in than go out.
An excerpt from it about the packet size: “the average MTU is 1806.81 bytes due to the dominance of the traditional Ethernet MTU value equal to 1500 bytes as shown by the median MTU value. For the network load on each server, we see that: (i) the average server network traffic is roughly 1.16 Kpps and 5.7 Mbps; (ii) the average packet size is 300 bytes, and (iii) the weekend day network traffic is only slightly lower than the week day traffic.”
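As a quick sanity check on those averages (my arithmetic, not the paper’s): 1.16 Kpps of 300-byte packets works out to roughly 2.8 Mbps, the same order of magnitude as the reported 5.7 Mbps; the gap is unsurprising, since the per-server averages are computed independently rather than from one traffic mix.

```python
# Quick consistency check on the quoted per-server averages from the IBM paper.
avg_pps = 1_160        # reported average: ~1.16 Kpps per server
avg_pkt_bytes = 300    # reported average packet size

implied_bps = avg_pps * avg_pkt_bytes * 8
print(f"Implied load: {implied_bps / 1e6:.2f} Mbps")  # → 2.78 Mbps
```

The more interesting takeaway is the other direction: at a 300-byte average packet size, any given bit rate demands about five times the lookup rate that the same load at full-size frames would.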
Re the AWS router, I completely agree with you that AWS doesn’t need all the fancy add-on features that Juniper (and other vendors) provide – that’s probably why they decided to do it themselves :)). Vendors pack all those features in to differentiate themselves and to charge more for their devices. A lot of those features go unused, not just by cloud providers but by the average enterprise/SP as well. By doing it themselves, AWS can use all the chip area that would otherwise be wasted on unneeded stuff to optimize for packet performance, which is mainly what the cloud needs. While their routers are most likely not top-notch in quality due to their inexperience, for utility computing that’s all they care about and need, as you rightly pointed out.
At the end of the day, I feel networking (and IT in general) has become so commoditized over the years – all the more so since the cloud became mainstream – that we can expect to see more and more vendors consolidate or simply go away. Those who want to survive will have to reinvent themselves and differentiate their products instead of relying on the likes of Broadcom to provide chipsets/ASICs for them, essentially turning themselves into Broadcom resellers. Packing in a lot of features that almost no one uses or needs is not the answer though 😜. There are always markets for good and competitive products that solve the right problems, just as AMD has proven by coming back from the brink of death several times after being dealt almost-mortal blows by Intel – who’s feeling it more and more with each passing year as their loyal customers go AMD, go ARM, or do it themselves, like Apple.
Oh, and I too went through the Broadcom material shared by Oleg. Most of it is about physical-layer stuff, and there’s no treatment of the ASIC architecture – the amount of CAM/TCAM, the traffic manager, the fabric scheduler, the crossbar architecture, etc. – basically the important stuff. Typical Broadcom… It’d be really unfortunate if the networking industry had to rely on them to innovate on its behalf and provide vendors with the backbone of their products. It kinda reminds me of how the server/PC industry stagnated for decades after Intel took center stage and eliminated most of the competition, leaving us stuck with their crappy CISC architecture until recent years.