Will they ever start using their brains?
This morning I discovered yet another journalistic gem. It started innocently enough: someone announced prototype security software that blocks DDoS attacks. The fundamental idea (as explained in the article) sounds mushy: they start with a one-time user ID and add extra fields to the data packets. How can that ever scale in a public deployment (which is where you’d be most concerned about a DDoS attack)?
But the true “revelation” came at the beginning of page 2: this software can filter bogus packets in 6 nanoseconds on a Pentium-class processor. Let’s put this in perspective. A Pentium CPU running at 5 GHz gets 30 clock cycles in 6 ns, so it can execute about 30 instructions … or maybe fewer, since it’s a CISC, not a RISC design … or maybe more, thanks to parallel instruction execution. Never mind; in most cases you need more than that just to process the interrupt.
If you want more meaningless numbers, use this MIPS table. The highest-rated Pentium delivers just over 9,000 MIPS, or less than 60 “average-sized” instructions in 6 ns.
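The back-of-the-envelope arithmetic behind these numbers can be sketched in a few lines of Python. The function names are made up for illustration, one instruction per clock cycle is a simplifying assumption, and the 9,000 MIPS rating is an assumed round figure consistent with the “less than 60 instructions” estimate.

```python
def cycles_in(window_ns, clock_ghz):
    """Clock cycles available in a time window (1 GHz = 1 cycle per ns)."""
    return window_ns * clock_ghz


def instructions_in(window_ns, mips):
    """Instructions executed in a time window at a given MIPS rating.

    1 MIPS = 1e6 instructions per second = 0.001 instructions per ns.
    """
    return window_ns * mips / 1000.0


# Assumption: one instruction per cycle on the 5 GHz CPU from the text.
print(cycles_in(6, 5))            # 30
# Assumption: a rating of roughly 9,000 MIPS.
print(instructions_in(6, 9000))   # 54.0
```

Either way you get a budget of a few dozen instructions per packet, before any memory access is taken into account.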
But there’s more: if you want to process an incoming packet, you have to fetch it from DRAM first. The best DDR2 DRAM on the market has more than 10 ns CAS latency (the time between the CPU indicating what it needs and the DRAM delivering the data). The article thus claims that this wonder software can reject packets faster than it can fetch them from the I/O buffer.
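To see how a CAS latency quoted in clock cycles turns into nanoseconds, here is a minimal sketch. The DDR2-800 CL5 module (400 MHz I/O bus clock, 5 cycles) is an example picked for illustration, not a figure from the article, and the helper name is hypothetical.

```python
BUDGET_NS = 6.0  # per-packet filtering time claimed in the article


def cas_latency_ns(cl_cycles, bus_clock_mhz):
    """Convert a CAS latency from bus clock cycles to nanoseconds."""
    return cl_cycles * 1000 / bus_clock_mhz


# Illustrative module: DDR2-800 at CL5 (400 MHz I/O bus clock).
latency = cas_latency_ns(5, 400)
print(latency)               # 12.5
print(latency > BUDGET_NS)   # True: one DRAM read already blows the budget
```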
A modern NIC will raise a single interrupt for a whole batch of packets.
So, on average, it is possible to read packets in 6 ns. But even so, your doubts are still relevant.
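A rough sketch of that amortization argument, with placeholder numbers: the 1,000 ns interrupt cost and the batch size of 64 are assumptions chosen only to show the arithmetic, not measurements, and the function names are hypothetical.

```python
import math


def per_packet_interrupt_ns(interrupt_cost_ns, packets_per_interrupt):
    """Share of one interrupt's cost carried by each packet in the batch."""
    return interrupt_cost_ns / packets_per_interrupt


def min_batch_for_budget(interrupt_cost_ns, budget_ns):
    """Smallest batch size that keeps the per-packet share within budget."""
    return math.ceil(interrupt_cost_ns / budget_ns)


# Assumed values: 1,000 ns per interrupt, 64 packets per interrupt.
print(per_packet_interrupt_ns(1000, 64))   # 15.625
print(min_batch_for_budget(1000, 6))       # 167
```

Batching only spreads out the interrupt overhead. It does nothing for the memory reads each packet still needs.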
To reject a packet, you still have to:
* read the buffer descriptor (it's invalidated from the CPU cache when the NIC flips the "available" bit).
* read the IP address fields (to look up the source IP address in the whatever-ID table).
* read the checksum (they call it HMAC) in the packet.
You cannot read the IP address field until you know where the buffer is, so you have at least two back-to-back CAS latencies (reading the checksum could, in theory, be overlapped with reading the IP address). 6 ns is still science fiction ... unless, of course, you have a fantastic architecture, a wildly fast CPU with lots of cache, multiple SDRAMs and a multi-threaded, hand-optimized packet rejector.
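The dependency between those reads can be sketched as a tiny critical-path calculation. This is a simplified model under stated assumptions: every read misses the cache and costs a flat 10 ns (the rounded-down CAS figure from the post), independent reads overlap perfectly, and the read names are labels invented for this example.

```python
READ_LATENCY_NS = 10.0  # assumption: one uncached DRAM read, per the CAS figure

# Each read lists the reads that must complete before it can start.
DEPENDS_ON = {
    "descriptor": [],               # tells us where the packet buffer is
    "ip_address": ["descriptor"],   # needs the buffer location
    "hmac": ["descriptor"],         # can overlap with the IP address read
}


def finish_time(read, latency_ns=READ_LATENCY_NS):
    """Earliest time (ns) at which a read can complete."""
    start = max(
        (finish_time(dep, latency_ns) for dep in DEPENDS_ON[read]),
        default=0.0,
    )
    return start + latency_ns


critical_path_ns = max(finish_time(read) for read in DEPENDS_ON)
print(critical_path_ns)   # 20.0
```

Even with the checksum read fully overlapped, the chain is two reads deep, which is more than three times the claimed 6 ns.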
It might be interesting to see the exact traffic profile and how this 6 ns figure was derived... I think you would then "understand" where it came from.
Generally, people won't give any such details, so that nobody can cross-check the numbers and find the hole in the theory.