It Doesn’t Make Sense to Virtualize 80% of the Servers
A networking engineer was trying to persuade me of the importance of hardware VXLAN VTEPs. We quickly agreed that physical-to-virtual gateways are the primary use case, and he tried to illustrate his point by saying, “Imagine you have 1000 servers in your data center and you manage to virtualize 80% of them. How will you connect them to the other 200?” to which I replied, “That doesn’t make any sense.” Here’s why.
How many hypervisor hosts will you need?
Modern servers have ridiculous amounts of RAM and CPU cores as I explained in the Designing Private Cloud Infrastructure webinar. Servers with 512 GB of RAM and 16 cores are quite common and becoming relatively inexpensive.
Assuming an average virtualized server needs 8 GB of RAM (they usually need less than that), you can pack over 60 virtualized servers into a single hypervisor host. The 800 virtualized servers thus need fewer than 15 physical servers (for example, four Nutanix appliances), or 30 10GE ports – less than half a ToR switch.
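If you want to check the back-of-the-envelope math, here’s a quick sketch. The 512 GB hosts, 8 GB per VM, and two 10GE uplinks per hypervisor host come from the paragraphs above; the exact rounding is mine (the text rounds up to 15 hosts / 30 ports to leave some headroom).

import math

# Consolidation math from the text: 1000 servers, 80% virtualized,
# 512 GB hosts, 8 GB per average VM, two 10GE uplinks per hypervisor host.
total_servers = 1000
virtualized = int(total_servers * 0.8)                     # 800 servers become VMs
vms_per_host = 512 // 8                                    # 64 VMs fit into one 512 GB host
hypervisor_hosts = math.ceil(virtualized / vms_per_host)   # 13 physical hosts
ten_ge_ports = hypervisor_hosts * 2                        # 26 ports - less than half a ToR

print(f"{virtualized} VMs -> {hypervisor_hosts} hosts -> {ten_ge_ports} 10GE ports")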
Back to the physical world
The remaining 200 physical servers need 400 ports (two per server), most commonly a mixture of everything from Fast Ethernet to 1GE and (rarely) 10GE. Mixing that hodgepodge of legacy gear with high-end hypervisor hosts and line-rate 10GE switches makes no sense.
What should you do?
I’ve seen companies doing network refreshes without virtualizing and replacing the physical servers. They had to buy almost-obsolete gear to get the 10/100/1000 ports required by the existing servers, and thus closed the door on 10GE deployment (because they won’t get a new CapEx budget for the next 5 years).
Don’t do that. When you’re building a new data center network or refreshing an old one, start with its customers – the servers: buy new high-end servers with plenty of RAM and CPU cores, virtualize as much as you can, and don’t mix the old and the new world.
This does require synchronizing your activities with the server and virtualization teams, which might be a scary and revolutionary thought in some organizations; we’ll simply have to get used to talking with other people.
Use one or two switches as L2/L3 gateways, and don’t even think about connecting the old servers to the new infrastructure. Make it abundantly clear that the old gear will not get any upgrades (the server team should play along) and that the only way forward is through server virtualization… and let the legacy gear slowly fade into obsolescence.
Designing a new data center network?
You’ll get design guidelines and technology deep dives in various data center, cloud computing and virtualization webinars, and you can always use me to get design help or a second opinion.
Also, hardware appliances for network services like firewalls, load balancers, IPS, and Citrix access gateways are often cheaper than the licenses for their virtual equivalents.
A) Those exceptions usually don't represent 20% of the servers (or ports)
B) It still doesn't make sense to mix them with the hypervisor hosts on the same ToR switches.
Can you expand on why it doesn't make sense to mix those appliances and legacy servers that absolutely can't be virtualized onto the same ToR switches as the hypervisor hosts?
More than likely, your virtual farm and big iron are going to be separately racked and cabled anyway.
I am not fully convinced by the basic assumption that "we quickly agreed physical-to-virtual gateways are the primary use case." I would rather look at the problem from the controller's scalability and performance point of view, that is, where one would deploy the VTEPs: on the hypervisors or on the ToRs.
Consider a different use case with 50K VMs: at 60 VMs per physical host, that's 825+ physical hosts (all virtualized). Assuming 5 VMs per VNI, that's about 10K VNIs, and the VMs of a given tenant reside on different physical hosts.
If one were to place the VTEPs on the hypervisors for this use case, the scaling numbers are as follows:
1) Two TCP connections to each hypervisor, one for OVSDB and another for OF (with NSX or with ODL), so the controller has to handle about 1,650 TCP connections (2 × 825+) just to manage the hypervisors.
2) If OF 1.0 is used, the number of virtual ports created on a single physical host is 60 × 5 = 300, so the controller has to handle 300 × 825 ≈ 250K virtual ports. Agreed, this number is reduced when OF 1.3 is used, but I don't have numbers on by how much.
3) The number of flows the controller has to handle also increases, because every flow is programmed by the controller.
4) The controller has to manage 825+ physical hosts to distribute VM routes.
On the other hand, if the VTEPs are deployed on ToR switches with 30 10GE ports each:
A) The number of TCP connections to the controller is 25+. We only need the OVSDB connection and don't require OF, as solutions like NSX leave the programming of flows to the hardware vendor instead of using OF.
B) As there is no OF in the picture, the controller need not bother with creating virtual ports, handling flow entries, etc.
C) The controller has to manage only 25+ hardware VTEPs to distribute VM routes.
So, to summarize, the scalability of the controller becomes an important argument for choosing hardware VXLAN gateways.
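A rough sketch of that comparison, using only the assumptions stated in this comment (50K VMs, 60 VMs per host, OVSDB + OF sessions for hypervisor VTEPs, 300 OF 1.0 virtual ports per host, OVSDB only for ToR VTEPs); the hosts-per-ToR figure is a guess that roughly reproduces the 25+ number above:

import math

# Controller load in the two deployment models (all inputs are the
# commenter's assumptions; hosts_per_tor is a guess).
total_vms = 50_000
hosts = math.ceil(total_vms / 60)            # ~834 physical hosts

# VTEPs on the hypervisors: OVSDB + OpenFlow session per host
hypervisor_sessions = hosts * 2              # ~1,668 TCP sessions
virtual_ports = hosts * 300                  # ~250K OF 1.0 virtual ports

# VTEPs on the ToR switches: one OVSDB session per switch, no OpenFlow
hosts_per_tor = 30
tor_sessions = math.ceil(hosts / hosts_per_tor)   # ~28 TCP sessions

print(f"hypervisor VTEPs: {hypervisor_sessions} sessions, {virtual_ports} virtual ports")
print(f"ToR VTEPs: {tor_sessions} sessions")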
2) You don't need virtual ports the way you think you do. Read http://blog.ipspace.net/2013/08/are-overlay-networking-tunnels.html and the comments to it.
3) The number of forwarding entries isn't that different from the hardware VTEP case, and software forwarding entries cost you less than hardware ones.
4) So what? What's the number of changes-per-second?
Finally, with all the questions you're asking, I think it's time for full disclosure: who are you working for?