Can Virtual Routers Compete with Physical Hardware?
One of the participants of the Carrier Ethernet LinkedIn group asked a great question:
When we install a virtual-router of any vendor over an ordinary server (having general-purpose microprocessor), can it really compete with a physical-router having ASICs, Network Processors…?
Short answer: No … and here’s my longer answer (cross-posted to my blog because not all of my readers participate in that group).
While the software-only forwarding process can reach 200 Gbps or more on a multi-core Xeon server, you cannot get anywhere close to the pps-per-$ price point of an equivalent hardware solution.
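To see why the comparison is made in packets per second rather than bits per second, here is a minimal sketch that converts throughput into a packet rate. It assumes standard Ethernet framing (20 bytes of preamble, start-of-frame delimiter and inter-frame gap per frame); the function name `line_rate_pps` and the chosen frame sizes are illustrative, not taken from any vendor's data sheet.

```python
# Per-frame overhead on the wire (assumption: standard Ethernet):
# 7-byte preamble + 1-byte start-of-frame delimiter + 12-byte inter-frame gap
ETHERNET_OVERHEAD_BYTES = 20


def line_rate_pps(bits_per_second, frame_bytes):
    """Maximum packets per second at a given throughput and Ethernet frame size."""
    bits_per_frame = (frame_bytes + ETHERNET_OVERHEAD_BYTES) * 8
    return bits_per_second / bits_per_frame


if __name__ == "__main__":
    # 200 Gbps is the software forwarding figure quoted above
    for frame_bytes in (64, 512, 1518):
        mpps = line_rate_pps(200e9, frame_bytes) / 1e6
        print(f"{frame_bytes:>5}-byte frames at 200 Gbps: {mpps:7.1f} Mpps")
```

With minimum-size 64-byte frames, 200 Gbps works out to roughly 298 Mpps, while with 1518-byte frames it is only about 16 Mpps, so the same Gbps figure can hide very different per-packet workloads.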
Before someone starts making list price comparisons, do keep in mind that when you buy a switch or a router from a mainstream manufacturer, you're not paying for the hardware, but (mostly) for software and support, as well as sales and marketing expenses. Hardware is usually less than 30% of the total costs (just look at gross margin from any major networking hardware vendor).
On the other hand, lower-speed routers use CPU-based forwarding anyway - replacing them with a VM-based form factor (virtual router) is a no-brainer.
Finally, while it might make sense (from speed-of-deployment perspective) to use virtual routers, many NFV deployments I see today deploy virtual firewalls, protocol translation/termination, load balancers or DPI devices. The appliance version of these devices usually uses CPU-based forwarding anyway (potentially augmented by an internal switch to ensure traffic is distributed deterministically to multiple cores) - yet again making them a perfect fit for VM-based deployment.
The only good reason I found so far for hardware-assisted appliance functionality is RSA key exchange in SSL termination. This process is really slow when done in software, and can be done much faster on dedicated coprocessors.
For more details on NFV forwarding performance, register for my NFV webinar.
A recent (even five-year-old) ASIC-based vendor router is almost certainly using less power for the same traffic. This might not matter that much if you don't need terabits of performance, though.
I do think it's a shame that there's not a vendor (at least that I'm aware of) making a box which is a nice Xeon server with a fully-plumbed EZchip (one of the higher-end ones capable of doing Internet-scale routing) with a decent SW plumbing layer on top. The software to do useful carrier Internet routing on Linux is finally getting complete enough that you could consider deploying it.
The Pluribus ones come closest of those I'm aware of, but have the downside of a second switch chip, and needing to run their OS.
As for Pluribus - if I got it right (and I have no idea, because they never got to the technical details in their Tech Field Day presentations), all they have in their proprietary hardware is extra 10GE lanes to the Xeon CPU, making it possible to do more than what you can squeeze on the PCI bus between Trident-2 and CPU. Obviously that advantage goes away the moment you deploy their SW on whitebox HW.
For the hardware, the best I've seen is what I have in my personal lab: a 2-slot ATCA chassis with a Xeon box in one slot and an EZchip-based switch in the other. Of course this is all ancient kit, but current-gen stuff does exist and is probably feasible.
Shame there don't seem to be any vendors really looking to roll this up; it would be a fun thing to build.
* If you want to have reliable NFV deployment, you _SHOULD_ deploy VNFs (fancy names for VMs) on dedicated infrastructure and carefully manage the oversubscription;
* While software-based forwarding always incurs more latency than hardware-based forwarding, I don't think it matters the moment the traffic hits the first WAN link.
And for the sake of example, Juniper vSRX (Firefly) adds latency of 5-10 ms on very light loads. E.g. if it serves some lonely VoIP call late at night, quality will suffer.
Thanks for the data point - it will definitely come in handy ;)
So, let's round and guess 1500 *new* TLS connections per core per second on a modern server. Modern Xeon servers have at least 16 cores...
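As a back-of-the-envelope sketch of that guess: the 1500 handshakes per core per second and the 16 cores are the rough figures above, and the code assumes handshake capacity scales linearly with core count, which is a simplification. The function names are made up for illustration.

```python
def new_tls_connections_per_second(handshakes_per_core, cores):
    """Total new TLS connections per second, assuming linear scaling across cores."""
    return handshakes_per_core * cores


def cores_needed(target_connections_per_second, handshakes_per_core):
    """Cores required to sustain a target rate of new TLS connections (rounded up)."""
    # Ceiling division using integers only
    return -(-target_connections_per_second // handshakes_per_core)


if __name__ == "__main__":
    # Rough guesses from the comment above, not measurements
    total = new_tls_connections_per_second(handshakes_per_core=1500, cores=16)
    print(f"16 cores at 1500 handshakes/core/s: {total} new connections/s")
    print(f"Cores needed for 5000 new connections/s: {cores_needed(5000, 1500)}")
```

Under those assumptions a single 16-core server handles about 24,000 new TLS connections per second before session resumption or keep-alive reduce the handshake load any further.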
Any application that is connecting and disconnecting that frequently is broken by design. The user experience would be terrible even without TLS overhead. Use keep-alive connections for HTTP, along with session resumption, and it really isn't a problem.
We run a mid-size SaaS application doing SSL termination on just four Xeon cores spread across two load balancing nginx instances. They hover around 15%, and that includes handshakes and bulk crypto.
CloudFlare, Google, Facebook, etc. do NOT use hardware for SSL acceleration, because it just doesn't matter if you have HTTP keep-alive and session cache/ticketing enabled. I believe Google said turning on SSL increased their front-end server load by something like 2-3% overall.