Category: scalability
Scaling Overlay Virtual Networks: The Problem
Every major hypervisor and networking vendor has an overlay virtual networking solution. Obviously they’re not identical, and some of them work better than others in large-scale environments – an interesting challenge we tried to address in the Scaling Overlay Virtual Networks webinar. As always, we started by identifying the potential problems.
Just Published: Scaling Overlay Virtual Networking Videos
The edited videos for the Scaling Overlay Virtual Networking webinar are available on the ipSpace.net Content site. Nuage Networks sponsored the webinar; the videos are thus publicly available (without registration).
Scaling Distributed Systems Is Hard
I stumbled upon a hilarious description of the challenges encountered when trying to scale distributed systems (a cluster of controllers running a centralized control plane comes to mind).
It starts with “If someone tells you that scaling out a distributed system is easy they are either lying or drunk, and possibly both,” and gets better and better. Enjoy!
Does a Cloud Orchestration System Need an Underlying SDN Controller?
A while ago I had an interesting discussion with a fellow SDN explorer, in which I came to the conclusion that it makes no sense to insert an overlay virtual networking SDN controller between a cloud orchestration system and virtual switches. As always, I missed an important piece of the puzzle: federation of cloud instances.
2014-11-04 16:48Z: CJ Williams sent me an email with information on the SDN controller in the upcoming Windows Server release. Thank you!
Scalability Enhancements in Cisco Nexus 1000V
The latest release of Cisco Nexus 1000V for vSphere can handle almost twice as many vSphere hosts as the previous one (250 instead of 128). Cisco probably did a lot of code polishing to improve Nexus 1000V scalability, but I’m positive most of the improvement comes from interesting architectural changes.
Controller Implementation Choices Affecting OpenFlow Scalability
The first part of the Real-life OpenFlow Use Cases webinar focused on controller design and implementation choices that can significantly impact the scalability of an OpenFlow solution:
- Proactive versus reactive flow setup;
- Hop-by-hop versus path-based forwarding;
- State explosion with OpenFlow 1.0.
You could tell we had great fun with these topics: we spent more than half an hour on five slides.
Are Overlay Networking Tunnels a Scalability Nightmare?
Every time I mention overlay virtual networking tunnels someone starts worrying about the scalability of this approach along the lines of “In a data center with hundreds of hosts, do I have an impossibly high number of GRE tunnels in the full mesh? Are there scaling limitations to this approach?”
Not surprisingly, some ToR switch vendors abuse this fear to the point where they look downright stupid (but I guess that’s their privilege), so let’s set the record straight.
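To put numbers on the question, here is a minimal sketch that simply counts what a full mesh would mean. It assumes one tunnel per pair of hosts, which is the premise of the question and not a statement about how any particular overlay product sets up its tunnels; `full_mesh_stats` is a hypothetical helper name used only for this illustration.

```python
def full_mesh_stats(hosts: int) -> tuple[int, int]:
    """Return (peers_per_host, total_host_pairs) for a full mesh of `hosts` endpoints.

    Assumption: every host pairs with every other host exactly once.
    """
    if hosts < 0:
        raise ValueError("hosts must be non-negative")
    if hosts < 2:
        # A mesh needs at least two endpoints.
        return 0, 0
    peers_per_host = hosts - 1                # each host pairs with all the others
    total_pairs = hosts * (hosts - 1) // 2    # n(n-1)/2 unordered host pairs
    return peers_per_host, total_pairs


for n in (100, 500, 1000):
    peers, pairs = full_mesh_stats(n)
    print(f"{n} hosts: {peers} peers per host, {pairs} host pairs in total")
```

The total grows quadratically with the number of hosts, while the number of peers any single host has to know about grows only linearly. Which of the two numbers matters depends on where the tunnel state actually lives.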
50 Shades of Statefulness
A while ago Greg Ferro wrote a great article describing the integration of overlay and physical networks, in which he wrote that “an overlay network tunnel has no state in the physical network”. That triggered an almost-immediate reaction from Marten Terpstra (of RIPE fame, now @ Plexxi), who argued that the network (at least the first ToR switch) knows the MAC and IP address of the hypervisor host and thus has at least some state associated with the tunnel.
Marten is correct from a purely scholastic perspective (using his argument, the network keeps some state about TCP sessions as well), but what really matters is how much state is kept, which device keeps it, how it’s created and how often it changes.
How big is a big private cloud?
While listening to the UCS Director Overview Packet Pushers Podcast recently, I heard the participants discuss use cases, and someone mentioned that UCS Director might not be applicable for small shops with only a few thousand VMs. Let's put that in perspective.
Virtual Networks: the Skype Analogy
I usually use the “Nicira is Skype of virtual networking” analogy when describing the differences between Nicira’s NVP and traditional VLAN-based implementations. Cade Metz liked it so much he used it in his What Is a Virtual Network? It’s Not What You Think It Is article, so I guess a blog post is long overdue.
Before going into more details, you might want to browse through my Cloud Networking Scalability presentation (or watch its recording) – the crucial slide is this one:
Designing Scalable Web Applications: Introduction
My regular readers probably know that I’m running a 4-month course in scalable web application design at the University of Ljubljana (everyone else will find more details here). I was extremely surprised when we started – I’d expected to see about a dozen students, and suddenly realized I was standing in front of a packed classroom. The next amazing surprise was the students’ level of motivation, commitment, knowledge, and the quality of their questions. It’s definitely fun to have an audience like that.