Blog Posts in November 2013
I gave three SDN 101 presentations during last week’s visit to South Africa and tried really hard to overcome my grumpy skeptic self and find the essence of SDN while preparing for them. While thinking about controllers, central visibility and network device programmability, it struck me: we already had SDN in 1993.
In the first Terastream blog post I mentioned that Deutsche Telekom decided to use an IPv6-only access network. Does that mean they decided to go down the T-Mobile route and deploy NAT64 + 464XLAT? That combo wouldn’t work well for them, and they couldn’t use MAP-E due to lack of IPv4 address space, so they deployed yet another transition mechanism – Lightweight 4over6.
The easiest way of connecting overlay virtual networks implemented with VMware NSX for vSphere to the outside world is the NSX Edge Services Router. It’s a much-improved version of vShield Edge and provides way more than just layer-3 forwarding services – it’s also a firewall, load balancer, DHCP server, DNS forwarder, NAT and VPN termination device.
Dan sent me the following question:
I had another read of the ‘Building IPv6 Service Provider Networks’ material and can see the PE routers use site-local IPv6 addressing. I’m about to build another small service provider setup and wondered: would you actually use site-local for PE loopbacks etc., or would you use ULA or global addressing? I’m thinking ULA would be better from a security point of view?
TL&DR summary: Don’t do that.
Even though I questioned the wisdom of writing your own network programming applications, I know I would immediately jump into those waters if I were 20 years younger. If you’re like my younger self, you might want to keep a few guidelines in mind.
As one of their early marketing moves, VMware started promoting VMware NSX with a catchy “fact” – you can deploy a new VM or virtual disk in minutes, but it usually takes days or more before you can get a new VLAN or a firewall or load balancer rule from the networking team.
Ignoring the complexity of network virtualization, they had a point, and the network services rigidity really bothered me … until I finally realized that we’re dealing with a broken process.
All overlay virtual networking solutions look similar from far away: many provide layer-2 segments, most of them have some sort of distributed layer-3 forwarding, gateways to the physical world are ubiquitous, and you might find security features in some products.
The implementation details (usually hidden behind the scenes) vary widely, and I’ll try to document at least some of them in a series of blog posts, starting with VMware NSX.
Almost a year ago rumors started circulating about a Deutsche Telekom pilot network using some crazy new optical technology. In spring I heard about them using NFV and Tail-f NCS for service provisioning … but it took a few more months until we got the first glimpses of their architecture.
TL&DR summary: Good design always beats bleeding-edge technologies
Major vendors (with the exception of NEC) haven’t made any progress. Juniper still hasn’t delivered on its promises. Cisco still hasn’t shipped an OpenFlow switch or an SDN controller (although it announced both months ago). Brocade supposedly has OpenFlow on its high-end routers, and Arista supports OpenFlow on its old high-end switch (but not in a GA EOS release).
Every major vendor is talking about SDN, but it’s mostly SDN-washing (aka CLI-in-API-disguise). Cisco is talking about OnePK and has shipped an early-adopter SDK, but it will take a while before we see OnePK in GA code on a widespread platform.
Startups aren’t doing any better. Big Switch is treading water and trying to find a useful use case for its controller. Nicira was acquired by VMware and is moving away from OpenFlow. Contrail was acquired by Juniper and recently shipped its product (which has nothing to do with OpenFlow and not much to do with SDN). LineRate Systems was acquired by F5 and disappeared.
We haven’t seen many customer deployments either. Facebook is doing interesting things (but from what I’ve heard they’re not OpenFlow-based); Google has an OpenFlow/SDN deployment, but it could have done the exact same thing with classical routers and PCEP; Microsoft’s SDN is based on BGP (and works fine).
It seems reality hit OpenFlow, and it was a very hard hit … and according to Gartner we haven’t reached the trough of disillusionment yet.
In late October I gave the closing presentation at our yearly customer event and decided to talk about one of the most pressing (at least in my opinion) IT problems – technical debt from the networking/sysadmin perspective.
You can view the presentation on my web site. It’s one of those presentations that look way better on video (which will be published … but it’s in Slovenian), but I’m positive the meme-lovers will enjoy it.
Traditional data centers are usually built in a very non-scalable fashion: everything goes through a central pair of firewalls (and/or load balancers) with thousands of rules that no one really understands; servers in different security zones are hanging off VLANs connected to the central firewalls.
Some people love to migrate the whole concept intact to a newly built private cloud (which immediately becomes server virtualization on steroids) because it’s easier to retain existing security architecture and firewall rulesets.
If you repeat something often enough, it becomes a “fact” (or an urban myth). SDN is no exception; the industry press loves to explain it like this:
[SDN] takes the high-end features built into routers and switches and puts them into software that can run on cheaper hardware. Corporations still need to buy routers and switches, but they can buy fewer of them and cheaper ones.
That nice soundbite contains at least one stupidity per sentence:
Early November seems to be the right time for the data center product harvest: after last week’s Juniper launch, Arista launched its new switches on Monday. The launch was all we’ve come to expect from Arista: better, faster, more efficient switches … and a dash of PureMarketing™ – the Splines.
You did read my blog post on ThousandEyes, didn’t you? What I forgot to mention was that they have this cool API that allows you to extract measurement data (including BGP topology) from their system. Can we do something cool with that?
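Just to whet your appetite, here’s a minimal Python sketch of what pulling data out of a REST API like theirs could look like. The base URL, endpoint names and authentication details below are assumptions made purely for illustration – check the actual API documentation before trying anything like this.

```python
# Minimal sketch: fetch measurement data from a REST API such as ThousandEyes'.
# The base URL, endpoints and credentials are hypothetical placeholders.
import requests

API_BASE = "https://api.thousandeyes.example/v1"   # hypothetical base URL
AUTH = ("user@example.com", "api-token")            # hypothetical credentials

def get_json(path):
    """Fetch one API resource and return the parsed JSON payload."""
    response = requests.get(f"{API_BASE}/{path}", auth=AUTH, timeout=10)
    response.raise_for_status()
    return response.json()

if __name__ == "__main__":
    # Hypothetical endpoints: list configured tests, then dump the raw results
    # (which could include BGP topology data) for each of them.
    for test in get_json("tests").get("tests", []):
        print(test.get("testName"), "->", get_json(f"tests/{test['testId']}/results"))
```

Once the raw JSON is in your hands, the fun part is turning it into something visual – which is exactly the sort of experiment the post explores.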
The recent Juniper product launch included numerous components, among them: a new series of data center switches (including a badly-needed spine switch), MetaFabric reference architecture (too meta for me at the moment – waiting to see the technical documentation beyond the whitepaper level), and (finally) a leaf-and-spine virtual chassis – Virtual Chassis Fabric.
A while ago I had a discussion with someone who wanted to be able to move whole application stacks between different private cloud solutions (VMware, Hyper-V, OpenStack, CloudStack) and a variety of public clouds.
Not surprisingly, there are plenty of startups working on the problem – if you’re interested in what they’re doing, I’d strongly recommend you add CloudCast.net to your list of favorite podcasts – but the only correct way to solve the problem is to design the applications in a cloud-friendly way.