Quick Peek: Juniper vMX Router
While the industry press deliberates the disaggregation of Arista and Cisco, and Juniper’s new CEO, Juniper launched vMX, a virtual version of its MX router, which is supposed to deliver up to 160 Gbps of throughput (compared to the 10 Gbps offered by Vyatta 5600 and Cisco CSR 1000V). Can Juniper really deliver on that promise?
We know it’s reasonably easy to get 50+ Gbps out of an Intel server, with smart solutions going all the way to 200 Gbps. Judging by the vMX data sheet, Juniper got a number of things right:
- They decoupled the control and forwarding planes into separate VMs (they probably still have to run on the same physical host, but this makes it easier to dedicate CPU cores to individual functions);
- The virtual forwarding plane uses SR-IOV and the Intel DPDK, so the performance claims aren’t totally unrealistic (see the host-setup sketch below).
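For readers who haven’t touched SR-IOV yet, here’s a minimal sketch of what the host-side setup typically looks like on a generic Linux/KVM server; the interface name, VM name, and core numbers are made up for illustration, and Juniper’s vMX tooling may well do the equivalent for you:

# Enable four SR-IOV virtual functions on an Intel 10GE port (hypothetical eth2)
echo 4 > /sys/class/net/eth2/device/sriov_numvfs

# The VFs appear as PCI devices that can be passed straight into the forwarding-plane VM
lspci | grep -i "Virtual Function"

# Pin the forwarding-plane VM's vCPUs (hypothetical libvirt domain "vfp-vm") to
# dedicated host cores so the poll-mode driver isn't competing for CPU cycles
virsh vcpupin vfp-vm 0 2
virsh vcpupin vfp-vm 1 3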
I also like their licensing approach: You’re buying licenses (perpetual or annual) in steps of 100 Mbps, 1 Gbps and 10 Gbps, which (I hope) means you can make the virtual router as fast (or as slow) as you need it to be, particularly if you can transfer the licenses between multiple vMX instances.
Will vMX bring the death of physical routers? It will definitely replace the physical routers in service provider environments that decide to use NFV (example: Deutsche Telekom TeraStream), but then, vMX was the only thing Juniper could do in those environments to compete with Cisco and Brocade.
I’m not so sure about the enterprise WAN edge. As Matt Bolick pointed out during the Tech Field Day @ Interop New York, everyone designs servers for a 2-year lifecycle, whereas the WAN edge routers typically have to survive 5-10 years, and I’m positive smart enterprise buyers know that (regardless of what they say at various Something-Open-Something events).
More Information
You’ll learn more about virtual routers, appliances, and NFV in my SDN/NFV/SDDC workshop (the slide deck is already available to subscribers; the recordings will be published in 2014).
I suspect that the forwarding and control plane separation is similar to what they've been doing on the branch SRX line, where they take a multicore processor, spin up a virtual ASIC on some of the cores, and leave one or more for the control plane. This way they can share the same code as their boxes with real ASICs.
Here is an example from an SRX550, which has a six-core processor. flowd_octeon_hm is the virtual ASIC and has five cores (466% CPU) dedicated to packet forwarding. All the remaining processes share the sixth core: rpd, vrrpd, bfdd, snmpd and so on. I trimmed the output a bit to get the more obvious processes near the top, but you get the idea. The output probably looks familiar because it's from "top".
The control plane on the SRX is much more isolated than in a classic IOS box. I could easily flap BFD+BGP on a 2951, but it was much harder to get the SRX to that breaking point when flogging it with an Ixia. flowd_octeon_hm always shows high CPU utilization because they have the process in a tight loop looking for packets to forward.
last pid: 15007; load averages: 0.12, 0.11, 0.08 up 166+05:58:52 02:27:55
72 processes: 7 running, 64 sleeping, 1 zombie
CPU states: 85.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 15.0% idle
Mem: 180M Active, 136M Inact, 992M Wired, 156M Cache, 112M Buf, 509M Free
Swap:
PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
1260 root 9 76 0 992M 56596K select 0 ??? 466.11% flowd_octeon_hm
1310 root 1 76 0 21072K 13528K select 0 999:08 0.00% snmpd
1255 root 1 76 0 10132K 4604K select 0 907:39 0.00% ppmd
1286 root 1 76 0 19128K 8512K select 0 366:29 0.00% cosd
1309 root 1 76 0 27640K 10876K select 0 317:44 0.00% mib2d
1271 root 1 76 0 12336K 5652K select 0 241:54 0.00% license-check
1304 root 1 4 0 22948K 12504K kqread 0 156:34 0.00% eswd
1247 root 1 76 0 115M 17992K select 0 137:12 0.00% chassisd
1282 root 1 76 0 20372K 8764K select 0 136:24 0.00% l2ald
1280 root 1 4 0 54828K 28532K kqread 0 117:35 0.00% rpd
1263 root 1 76 0 15632K 3588K select 0 95:05 0.00% shm-rtsdbd
1313 root 2 76 0 24384K 9084K select 0 76:17 0.00% pfed
1270 root 1 76 0 24252K 15688K select 0 54:39 0.00% utmd
1269 root 1 76 0 13460K 5952K select 0 50:16 0.00% rtlogd
1283 root 1 76 0 14904K 6664K select 0 48:17 0.00% vrrpd
1305 root 1 4 0 17056K 7296K kqread 0 22:26 0.00% lldpd
1297 root 1 76 0 14460K 5992K select 0 12:35 0.00% pkid
1287 root 1 76 0 23084K 7836K select 0 12:32 0.00% kmd
1268 root 1 76 0 30016K 8896K select 0 6:12 0.00% idpd
1259 root 1 76 0 13036K 6332K select 0 5:14 0.00% bfdd
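If you want to poke at this on your own box, the same per-process CPU view should be available straight from the Junos CLI; the "top" output above needs shell access:

show system processes extensive    # top-like per-process CPU view from operational mode
start shell                        # or drop to the underlying FreeBSD shell...
top                                # ...and watch flowd_octeon_hm sitting on its cores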
"The flow_octeon_hm always shows high cpu utilization because they have the process in a tight loop looking for packets to forward."
Really enjoy the blog!
Ivan, I heard recently that AWS is also beginning to expand its use of SR-IOV in their cloud. I'm just curious how cloud providers support vMotion and other advanced features with SR-IOV. It's usually a tradeoff: you get either SR-IOV or VM mobility, but not both.
Cisco+VMware have an interesting solution with VM-FEX:
http://blog.ipspace.net/2012/03/cisco-vmware-merging-virtual-and.html
In a future-state architecture it probably doesn't need to run on the same machine; both Cisco and Juniper are looking to virtualize the control plane and have it control the hardware, whether that's an x86 server or a hardware chassis. Juniper has had this for years in the TX Matrix; I guess it's finally caught on again. One issue I saw was that there wasn't a way to tie physical interface state to the underlying VM; it required running a third-party tool on the host itself to synchronize state, but I'm sure they've figured it out by now. I believe that starting with 14.2 you can upgrade a vMX the same way you upgrade any other Juniper router, with a jinstall package.
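If that's the case, the upgrade presumably looks just like it does on a physical MX, something along these lines (the package file name is a made-up placeholder):

request system software add /var/tmp/jinstall-vmx-14.2R1-domestic-signed.tgz reboot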
It does require having Intel 10GE NICs in the box to make use of the DPDK, but most people use those anyway. Even with a modest amount of higher-touch features provisioned (firewall filters and the like), the throughput with 512-byte and larger packets is pretty good, but nowhere near what you would get out of even an MX80. Don't plan on running more than one of these on a server either.
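To illustrate the DPDK dependency: the Intel 10GE ports used by the forwarding plane have to be unbound from the kernel driver and handed to a DPDK-compatible driver on the host, roughly as sketched below. The PCI addresses are placeholders, the binding script is called dpdk_nic_bind.py or dpdk-devbind.py depending on the DPDK release, and Juniper's install scripts may well handle this step for you:

# Load a userspace I/O driver for the NICs
modprobe vfio-pci

# Check which driver each NIC is currently bound to
dpdk-devbind.py --status

# Rebind the two 10GE ports (hypothetical PCI addresses) to the DPDK-capable driver
dpdk-devbind.py --bind=vfio-pci 0000:04:00.0 0000:04:00.1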
I'm actually interested to see how well it could work on newer "network appliance" servers from people like Advantech or Lanner.
I'm also interested to see the ALU VSR, which claims even higher throughput using the same DPDK. They are already down the road of being able to cluster multiple servers into a single router, and are supposed to do a demo soon with 2 Tbps worth of x86 bandwidth managed as a single router.
I've heard that IPsec might be included soon. I could use NAT with minimal ALGs (DNS and FTP) for a vCPE.
Vyatta 5600 should be able to support 80 Gbps of throughput.
https://www.sdxcentral.com/articles/news/brocade-telefonica-push-vyattas-virtual-router-80g/2014/08/
https://www.sdxcentral.com/wp-content/uploads/2014/10/Vyatta-5600-Performance-Test-Executive-Summary.pdf