VXLAN, OTV and LISP
Immediately after VXLAN was announced @ VMworld, the twittersphere erupted in speculation and questions, many of them focusing on how VXLAN relates to OTV and LISP, and why we might need a new encapsulation method.
VXLAN, OTV and LISP are point solutions targeting different markets. VXLAN is an IaaS infrastructure solution, OTV is an enterprise L2 DCI solution and LISP is ... whatever you want it to be.
VXLAN tries to solve a very specific IaaS infrastructure problem: replacing VLANs with something that might scale better. In a massive multi-tenant data center with thousands of customers, each asking for multiple isolated IP subnets, you quickly run out of VLANs. VMware tried to solve the problem with MAC-in-MAC encapsulation (vCDNI); you could potentially do the same with the right combination of EVB (802.1Qbg) and PBB (802.1ah), very clever tricks a-la Network Janitor, or even with MPLS.
Compared to all these, VXLAN has a very powerful advantage: it runs over IP. You don’t have to touch your existing well-designed L3 data center network to start offering IaaS services, and the multipath bridging voodoo magic that a decent-sized vCDNI deployment would require is no longer needed. VXLAN gives Cisco and VMware the ability to start offering reasonably-well-scaling IaaS cloud infrastructure. It also gives them something to compete against the Open vSwitch/Nicira combo.
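Concretely, the encapsulation is tiny: an 8-byte VXLAN header carrying a 24-bit segment ID (VNI), wrapped in an ordinary UDP/IP packet. Here’s a minimal sketch of the header packing (field layout per the VXLAN specification; the function names are mine):

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08  # the "I" flag: a valid VNI is present

def encode_vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header: flags(1) + reserved(3) + VNI(3) + reserved(1)."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI is a 24-bit value")
    return struct.pack("!B3s3sB", VXLAN_FLAG_VNI_VALID, b"\x00\x00\x00",
                       vni.to_bytes(3, "big"), 0)

def decode_vni(header: bytes) -> int:
    """Extract the 24-bit segment ID from a VXLAN header."""
    return int.from_bytes(header[4:7], "big")

hdr = encode_vxlan_header(5000)
assert len(hdr) == 8 and decode_vni(hdr) == 5000
```

The 24-bit VNI is the whole scaling story: ~16 million segments instead of 4094 VLAN IDs, with the physical network seeing nothing but UDP/IP.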
Reading the VXLAN draft, you might notice that all the control-plane aspects are solved with handwaving. Segment ID values just happen, IP multicast addresses are defined at the management layer and the hypervisors hosting the same VXLAN segment don’t even talk to each other, but rely on layer-2 mechanisms (flooding and dynamic MAC address learning) to establish inter-VM communication. VXLAN is obviously a QDS (Quick-and-Dirty-Solution) addressing a specific need – increasing the scalability of IaaS networking infrastructure.
VXLAN will indeed scale way better than VLAN-based solutions, as it provides total separation between the virtualized segments and the physical network (no need to provision VLANs on the physical switches). It will also scale somewhat better than MAC-in-MAC encapsulation because it relies on L3 transport (and can thus work well in existing networks), but it’s still a very far cry from Amazon EC2. People with extensive (bad) IP multicast experience are also questioning the wisdom of using IP multicast instead of source-based unicast replication ... but if you want to remain control-plane ignorant, you have to rely on a third party (read: IP multicast) to help you find your way around.
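The “control plane by flooding” behavior can be sketched in a few lines: a VTEP floods unknown destinations to the segment’s (management-assigned) multicast group and learns remote MAC-to-VTEP bindings from the outer source addresses of decapsulated packets. The class and names below are my toy model, not anything from the draft:

```python
class Vtep:
    """Toy flood-and-learn VXLAN tunnel endpoint: one MAC table per segment."""
    def __init__(self, ip, group_for_vni):
        self.ip = ip
        self.group_for_vni = group_for_vni  # VNI -> IP multicast group (set by management)
        self.mac_table = {}                 # (vni, mac) -> remote VTEP IP

    def forward(self, vni, dst_mac):
        """Outer destination IP: known MAC -> unicast to its VTEP, unknown -> flood."""
        return self.mac_table.get((vni, dst_mac), self.group_for_vni[vni])

    def learn(self, vni, src_mac, outer_src_ip):
        """Data-plane learning: remember which VTEP sits behind src_mac."""
        self.mac_table[(vni, src_mac)] = outer_src_ip

v = Vtep("10.0.0.1", {5000: "239.1.1.1"})
print(v.forward(5000, "aa:bb"))   # unknown MAC -> flood to 239.1.1.1
v.learn(5000, "aa:bb", "10.0.0.2")
print(v.forward(5000, "aa:bb"))   # learned -> unicast to 10.0.0.2
```

Note that there’s no protocol between the VTEPs at all: everything an endpoint knows, it learned from flooded traffic, which is exactly the control-plane handwaving described above.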
It seems there have already been claims that VXLAN solves inter-DC VM mobility (I sincerely hope I’ve got the wrong impression from Duncan Epping’s summary of Steve Herrod’s general session @ VMworld). If you’ve ever heard about traffic trombones, you should know better (but it does prove a point @etherealmind made recently). Regardless of the wishful thinking and beliefs in flat earth, holy grails and unicorn tears, a pure bridging solution (and VXLAN is no more than that) will never work well over long distances.
Here’s where OTV kicks in: if you do become tempted to implement long-distance bridging, OTV is the least horrendous option (BGP MPLS-based MAC VPN will be even better, but it still seems to be working primarily in PowerPoint). It replaces dynamic MAC address learning with deterministic routing-like behavior, provides proxy ARP services, and stops unicast flooding. Until we’re willing to change the fundamentals of transparent bridging, that’s almost as good as it gets.
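The difference from flood-and-learn bridging can be sketched in a toy model (all names mine): an OTV edge device forwards only MAC addresses advertised by the control plane, drops unknown unicast instead of flooding it across the DCI, and answers ARP requests from a local cache:

```python
class OtvEdge:
    """Toy OTV edge device: MAC reachability comes from a routing protocol, not flooding."""
    def __init__(self):
        self.remote_macs = {}  # mac -> remote site, populated by control-plane advertisements
        self.arp_cache = {}    # ip -> mac, learned from observed ARP replies

    def receive_advertisement(self, mac, site):
        """Deterministic, routing-like MAC learning via the control plane."""
        self.remote_macs[mac] = site

    def forward(self, dst_mac):
        """Unknown unicast is dropped (returns None), never flooded across the DCI."""
        return self.remote_macs.get(dst_mac)

    def handle_arp_request(self, target_ip):
        """Proxy ARP: answer locally if cached instead of flooding the request."""
        return self.arp_cache.get(target_ip)

e = OtvEdge()
assert e.forward("aa:bb") is None            # unknown unicast: dropped
e.receive_advertisement("aa:bb", "site-2")
assert e.forward("aa:bb") == "site-2"        # advertised: forwarded deterministically
```

Contrast `forward` here with the flood-and-learn behavior of a plain bridge: the failure mode for an unknown MAC is a drop, not a flood storm across the WAN link.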
As you can see, it makes no sense to compare OTV and VXLAN; it’s like comparing a racing car to a downhill mountain bike. Unfortunately, you can’t combine them to get the best of both worlds; at the moment, OTV and VXLAN live in two parallel universes. OTV provides long-distance bridging-like behavior for individual VLANs, and VXLAN cannot even be transformed into a VLAN.
LISP is yet another story. It provides a very rudimentary approximation of IP address mobility across layer-3 subnets, and it might do it better once everyone realizes the hypervisor is the only place to do it properly. However, it’s a layer-3 solution running on top of layer-2 subnets, which means you might run LISP in combination with OTV (not sure that makes sense, but nonetheless), and you could run LISP in combination with VXLAN once you can terminate VXLAN on a LISP-capable L3 device.
So, with the introduction of VXLAN, the networking world hasn’t changed a bit: the vendors are still serving us all isolated incompatible technologies ... and all we’re asking for is tightly integrated and well-architected designs.
Another point that's not widely appreciated is that getting in and out of these VXLANs starts to look a whole lot like routing, very likely creating islands of VXLANs within the provider that are routed out to the rest of the world.
And while I agree that 'with the introduction of VXLAN, the networking world hasn’t changed a bit: the vendors are still serving us all isolated incompatible technologies', I'll also argue that it makes the remaining gaps more apparent and important.
(That is to say, I believe http://tools.ietf.org/html/draft-ietf-l2vpn-pbb-evpn-04 replaces http://tools.ietf.org/html/draft-raggarwa-mac-vpn-01 ...)
Sound about right?
Any updated thoughts on OTV/VxLAN(/FabricPath)/etc. when factoring PBB-EVPN into the conversation?
Why do you consider PBB-EVPN not a good technology for interconnecting data centers? Do you know if there is an existing comparison between PBB-EVPN and OTV to see which technology is better in this case?
In a secure DC with multiple security zones who will allow traffic to tunnel through firewalls?
Which is what LISP has proposed up to now, from what I can see.
So at some point outside the DC, the correct entrance point into the DC is chosen, based on the VM location, by consulting the Mapping Server. The traffic is then tunneled into the DC with LISP encapsulation, which can't be properly inspected by the firewall. Unless this point has been changed.
Since traffic is proxied (tunneled) between the PITR and the ETR, the path for non-LISP-enabled sites is Client - PITR - ETR - EID. This might not be the optimal path, and the resulting performance impact may hinder the adoption of this technology.
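The path stretch described above can be made concrete with a toy model (all hop names here are illustrative, not from any spec): a LISP-enabled site encapsulates at its own ITR straight toward the destination ETR, while a legacy site's traffic must first be attracted to a PITR that may sit well off the direct path:

```python
def lisp_path(client, eid, lisp_enabled, pitr="PITR"):
    """Return the forwarding path toward a LISP EID as a list of hops (toy model)."""
    if lisp_enabled:
        # The client's site has its own ITR: map-and-encap directly to the ETR
        return [client, "ITR", "ETR", eid]
    # Legacy (non-LISP) site: packets detour through a proxy ITR, which may be
    # nowhere near the shortest client-to-server path
    return [client, pitr, "ETR", eid]

print(lisp_path("client", "server", lisp_enabled=False))
# -> ['client', 'PITR', 'ETR', 'server']
```

The hop count looks identical, but the PITR leg is the problem: it is a geographic detour imposed on everyone who has not deployed LISP, which is exactly the adoption-incentive issue raised here.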
To put it in one statement: enabling LISP only benefits customers that have LISP enabled, which is a minority of the Internet community. The majority of customers may get worse performance until they switch to LISP. This makes me think that adopting LISP might not be a wise decision.
If non-LISP-enabled customers got the same performance, it would be an easier decision.
If LISP is where the Internet is going, a shared (or free) PITR infrastructure may accelerate the implementation.
I am still not convinced: why not just improve the MPLS encapsulation rather than introduce a whole new system to solve some old issues?
Things have changed a lot after a decade. Control- and data-plane operations now make every desired capability available. Now we have BGP EVPN that can use VXLAN purely as encapsulation, alongside Cisco's stubborn way of doing it with LISP/VXLAN (SDA). I have yet to understand Cisco's reasoning for choosing LISP over BGP; LISP can also be used as an encapsulation, but why isn't it there even now?