Extending Layer-2 Connection into a Cloud

Carlos Asensio was facing an “interesting” challenge: someone has sold a layer-2 extension into their public cloud to one of the customers. Being a good engineer, he wanted to limit the damage the customer could do to the cloud infrastructure and thus immediately rejected the idea of connecting the customer straight into the layer-2 network core ... but what could he do?

Overlay virtual networks just might be a solution if you have to solve a similar problem:

  • Build the cloud portion of the customer’s layer-2 network with an overlay virtual networking technology;
  • Install an extra NIC in one (or more) physical hosts and run a VXLAN-to-VLAN gateway in a VM on each such host – the customer’s VLAN is thus completely isolated from the data center network core;
  • Connect the extra NIC to the WAN edge router or switch on which the customer’s link is terminated. Whatever stupidity the customer does in their part of the stretched layer-2 network won’t spill further than the gateway VM and the overlay network (and you could easily limit the damage by reducing the CPU cycles available to the gateway VM, as sketched below).
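
As a rough illustration of the last bullet, this is one way you could cap the gateway VM’s CPU consumption on a KVM/libvirt host – the domain name (customer-gw) and the numbers are made up, and every other hypervisor has an equivalent knob:

  # give the gateway VM a low CPU weight relative to the other VMs on the host
  virsh schedinfo customer-gw cpu_shares=256

  # ... or hard-cap it at roughly half a core (quota/period in microseconds)
  virsh schedinfo customer-gw vcpu_period=100000 vcpu_quota=50000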

The diversity of overlay virtual networking solutions available today gives you plenty of choices:

  • You could use Cisco Nexus 1000V with VXLAN or the OVS/GRE/OpenStack combo at no additional cost (combining VLANs with GRE-encapsulated subnets might be an interesting challenge in the current OpenStack Quantum release);
  • VMware’s version of VXLAN comes with vCNS (a product formerly known as vShield), so you’ll need a vCNS license;
  • You could also use VMware NSX (aka Nicira NVP) with a layer-2 gateway (included in NSX).

Hyper-V Network Virtualization might have a problem dealing with dynamic MAC addresses coming from the customer’s data center – this is one of the rare use cases where dynamic MAC learning works better than a proper control plane.

The VXLAN-to-VLAN gateway linking the cloud portion of the customer’s network with the customer’s VLAN could be implemented with Cisco’s VXLAN gateway or with a simple Linux or Windows VM on which you bridge the overlay and VLAN interfaces (yet again, one of those rare cases where VM-based bridging makes sense). Arista’s 7150 or F5 BIG-IP would probably be overkill.
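
For the VM-based gateway, the interesting part is almost embarrassingly small. Assuming a Linux VM with eth0 sitting on the overlay segment and eth1 on a port group tied to the extra (customer-facing) NIC – interface and bridge names are purely illustrative – all you need is a bridge between the two vNICs:

  # create the bridge and enslave both vNICs (overlay side + customer VLAN side)
  ip link add br-cust type bridge
  ip link set eth0 master br-cust
  ip link set eth1 master br-cust

  # bring everything up; the bridge needs no IP address of its own
  ip link set eth0 up
  ip link set eth1 up
  ip link set br-cust up

The hypervisor virtual switch takes care of the overlay encapsulation; the VM does nothing but shuffle frames between its two interfaces.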

And now for a bit of totally unrelated trivia: once we solved the interesting part of the problem, I asked about the details of the customer interconnect link – they planned to have a single 100 Mbps link and thus a single point of failure. I can only wish them luck and hope they’ll try to run stretched clusters over that link.

9 comments:

  1. I caught the summary of this article in Feedly ("Carlos Asensio was facing an 'interesting' challenge: someone has sold a layer-2 extension into their public cloud to one of the customers.") and immediately attempted to re-enact that scene from Empire Strikes Back by shouting out loud: "NOOOOOOOOOOOOOO!!"

    Oh and I agree, stretched clusters over a L2 extension with a SPOF sounds like a great idea. KABOOM! ;)
  2. Or they can integrate a product like the Nexus 1000V Intercloud. I know you know all about it, but just in case: http://www.jedelman.com/1/post/2013/06/hybrid-cloud-networking-with-ciscos-nexus-1000v-intercloud.html

    Probably a bit more work upfront, but a nice offering to have as a cloud provider.
  3. Most of the hybrid cloud solutions available on the market seem to be software-based (Intercloud, Cloudswitch; Brocade also has one, I believe). So, is there an expectation that the bandwidth for north-south traffic (inter/intra-DC, hybrid cloud, etc.) is low enough that a software GW is sufficient?
    Replies
    1. No reason you can't use multiple gateways. Do you really want a SPOF anyway? :)
    2. You can always have multiple HW GWs to avoid a SPOF, but that would be expensive. The bottleneck with a SW solution would probably be performance, but it all depends on the amount of traffic between the DCs.
  4. Hyper-V dynamic MAC (and IP) learning is solved in the 2012 R2 release.
    Replies
    1. Not sure it does dynamic MAC learning - it seems you still have to define virtualization lookup records with a known MAC address and type L2Only.

      Dynamic IP learning - yes, it does that.
  5. Getting OVS in Quantum to use both VLANs and MAC-GRE tunnels just comes down to knowing how to set up OVS to do it. Quantum provider extensions allow you to implement the VLANs while having MAC-GRE as your default tenant networking scheme. We used the chef scripts from the Rackspace private cloud distro and only had to override attributes to make it work that way. Didn't need to change the recipe scripts at all.

    Here is what our OVS plugin config looks like on our compute and network nodes.

    [OVS]
    # default tenant networks are GRE tunnels
    tenant_network_type = gre
    integration_bridge = br-int
    local_ip = X.X.X.X

    # GRE tunneling: per-node tunnel bridge and tunnel ID range
    enable_tunneling = True
    tunnel_bridge = br-tun
    tunnel_id_ranges = 1:1000

    # VLAN provider networks: physical network ph-eth3, VLANs 2300-2399,
    # mapped to the br-eth3 OVS bridge
    network_vlan_ranges = ph-eth3:2300:2399
    bridge_mappings = ph-eth3:br-eth3

    We set up our VLANs as provider networks (both shared and non-shared). We use them both for l3_agent-based routing (OpenStack software NAT/SNAT) and to attach guest instance vifs directly to data center VLANs.

    You can then use quantum to set up the VLANs with:

    quantum net-create Net-2301 --provider:network_type vlan --provider:physical_network ph-eth3 --provider:segmentation_id 2301 --router:external true --shared

    Add some subnets to that network and you're in business. The subnets can be DHCP-enabled. As long as the network nodes are set up with the same VLANs reachable on a NIC (like the compute nodes), it launches your dnsmasq instance and it works just fine for guest instances. Just remember that dnsmasq is hardcoded to a host file, so it will not act as a DHCP server for other hosts on the VLAN because their MAC addresses are not on Quantum ports. Of course, you don't have to use OpenStack's dhcp_agent on the network; you can run your own DHCP server on the VLAN external to your cloud.
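
    For example, a DHCP-enabled subnet on that provider network could look something like this (the addressing is purely illustrative; DHCP is on by default unless you pass --disable-dhcp):

    quantum subnet-create Net-2301 192.0.2.0/24 --name Sub-2301 --gateway 192.0.2.1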

    Full disclosure: I work for F5 Networks... so I'm biased toward what a real proxy can do in the world!

    We have f5 BIG-IP VEs with one leg on the SDN (MAC-GRE) and the other on data center VLANs. We use that for all the fun proxy ADC gateway tricks including SSL-VPN to the cloud resources which only have SDN interfaces. A simple BIG-IP startup script on a KVM instance (reading from Nova metadata) and they boot, license, and set up L2+L3 addressing right from Quantum (excuse me... Neutron) managed networks. It's ADCaaS.

    Reach out to me if you want to talk about BIG-IP to BIG-IP tunneling use cases for cloud bridging. While routing through encrypted and compressed iSession tunnels is preferred, BIG-IP has EthIP (L2) tunneling and IPsec too. EthIP is how BIG-IP supports LDVM for existing connections with vSphere between data centers (bad idea... I know).
    Replies
    1. Thank you! Really appreciated!