The Difference between Metro Ethernet and Stretched Data Center Subnets

Every time I rant about large-scale bridging and stretched L2 subnets, someone inevitably points out that Carrier (or Metro) Ethernet works perfectly fine using the same technologies and principles.

I won’t spend any time on the “perfectly fine” part but will focus on the fundamental difference between the two: the use case.

Typical Metro Ethernet Use Case

Engineers who know what they’re doing connect individual sites to Metro Ethernet services with layer-3 devices (the others will eventually figure it out after a meltdown or two).

It doesn’t matter whether you call the site edge devices routers or switches; either way, they perform several critical functions:

  • They split the inside (your site) and the outside (service provider transport network) into two separate L3 subnets and two failure domains;
  • They run routing protocols, so other devices attached to the same Metro Ethernet service can figure out whether a site is reachable;
  • They can find alternate paths (if they exist) after a link or service failure.

In principle, the routers connecting your sites to a Metro Ethernet service treat that service as one of the potential transport networks, and can use routing protocols or BFD/CFM to figure out when the Metro Ethernet service is gone, even if the local link status doesn’t change.
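
As a minimal sketch, a site edge router terminating a Metro Ethernet service could look something like this (Cisco IOS-style syntax; all interface names, addresses, and process numbers are made-up placeholders):

  ! Hypothetical site edge router -- every name and address is a placeholder
  interface GigabitEthernet0/0
   description Inside: site LAN (its own subnet and failure domain)
   ip address 10.1.1.1 255.255.255.0
  !
  interface GigabitEthernet0/1
   description Outside: Metro Ethernet transport (separate subnet and failure domain)
   ip address 192.0.2.1 255.255.255.0
   ip ospf 1 area 0
   bfd interval 300 min_rx 300 multiplier 3
  !
  router ospf 1
   bfd all-interfaces
   network 10.1.1.0 0.0.0.255 area 0

With OSPF (and BFD) running across the transport VLAN, a site that loses the Metro Ethernet service simply disappears from the routing tables of the other sites, even when the local link stays up.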

Worst case, if the Metro Ethernet service falls apart and you’ve provisioned backup links, your sites can still communicate with each other. Even if the Metro Ethernet service experiences a severe meltdown, the hosts inside your sites will not be affected (the routers might be, due to the heavy CPU load induced by broadcasts received from the Metro Ethernet LAN).
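
If you’ve provisioned a backup link, a hypothetical floating static route (with an administrative distance worse than OSPF’s 110) is the simplest sketch of that failover behavior; the backup interface, next hop, and remote subnet are again placeholders:

  ! Hypothetical backup path -- installed only when the OSPF route learned
  ! across the Metro Ethernet service disappears
  interface GigabitEthernet0/2
   description Backup link (e.g., Internet VPN) to the remote site
   ip address 198.51.100.1 255.255.255.252
  !
  ip route 10.2.2.0 255.255.255.0 198.51.100.2 250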

Summary: it’s perfectly safe to use a layer-2 transport network as long as you terminate it with a layer-3 device.

Typical Stretched Data Center Subnet Use Case

In a typical layer-2 data center interconnect design, hosts are directly attached to the stretched layer-2 subnets (VLANs).

The servers (IP hosts) attached to stretched VLANs usually have no routing intelligence; all they know are two simple rules (illustrated below):

  • If the destination IP address belongs to the same subnet, use ARP to find the MAC address of the other host, and send the IP packet to that MAC address. If the ARP request fails, the other host is unreachable.
  • Otherwise, send the IP packet to the default gateway (using ARP to find the gateway’s MAC address).
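
The entire “routing logic” of such a host thus fits into two entries; on a Linux server, it could look something like this (the addresses and interface name are placeholders):

  # Hypothetical Linux host in a stretched VLAN -- all values are placeholders
  ip addr add 10.1.1.10/24 dev eth0   # rule 1: ARP directly for anything in 10.1.1.0/24
  ip route add default via 10.1.1.1   # rule 2: send everything else to the default gateway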

The lack of routing intelligence in typical servers is not a software/OS issue. Linux and z/OS support routing daemons, and so did Windows Server 2003 before that functionality got lobotomized (around the time of Windows Server 2008). However, it seems many engineers think a naked singularity would materialize and gobble up their whole data center if they configured a routing protocol on a server (hint: EBGP is better than OSPF).
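
For what it’s worth, here’s a minimal sketch of what EBGP on a server could look like, using FRR-style syntax (the AS numbers and addresses are invented for illustration, and 10.1.1.10/32 is assumed to be configured on the server’s loopback):

  ! Hypothetical FRR configuration on a Linux server -- all values are placeholders
  router bgp 65001
   neighbor 10.1.1.1 remote-as 65000
   !
   address-family ipv4 unicast
    network 10.1.1.10/32
   exit-address-family

The server advertises a host route for its loopback, so it stays reachable as long as any path toward it exists; that’s exactly the failure detection and repathing capability the hosts described next don’t have.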

Typical IP hosts have no means of detecting VLAN failure or partitioning, and cannot find alternate paths. They rely on the network devices to provide connectivity, and with no layer-3 intelligence in the path, there’s only so much the networking devices can do.

The layer-2 data center interconnect thus becomes the most critical part of the whole data center infrastructure: if it breaks, everything else stops working (assuming servers or VMs from the same subnet sit on both sides of the failure). Is that a good idea? Not in my book.

6 comments:

  1. I was working at a nameless large financial institution recently and I was roundly mocked for recommending that Metro Ethernet be terminated with a layer-3 device...

    Some people just won't listen :-)
  2. Some carriers are smart enough to instruct their customers to use an L3 device to terminate the Metro Ethernet service, and restrict the number of MAC addresses allowed to connect to that service.
  3. > it seems many engineers think naked singularity would materialize and gobble up their whole data center if they configured OSPF on a server.

    Not a naked singularity, but close - their network admin, armed with a local flavour of a pacifier. Consequences will likely be quite close to the aforementioned, too. ;)
  4. > They split the inside (your site) and the outside (service provider transport network) into two separate L3 subnets and two failure domains

    A Service Provider offering a Metro Ethernet service cannot force the customer to use routers at the end of an E-Line/E-Tree/E-LAN service. Such a service is therefore equivalent to an L2 DCI, and the core network needs to be prepared for a meltdown.

    What causes a meltdown in L2? Intermittent loops and excessive flooding; both can be limited. The most dangerous is a loop: it can and does happen in Metro Ethernet networks because customers connect L2 devices (or bridge two sites over a backdoor link). Does that mean Metro Ethernet should not be offered? It is reasonable to find solutions that eliminate meltdown threats, and there are mechanisms that are far better than plain STP.

    But I agree with you in terms of Data Center Interconnect. If there is an L3 choice, take it first. This is a simple strategy for avoiding the risk.

    >Typical IP hosts have no means of detecting the VLAN failure or partitioning, and cannot find alternate paths. They rely on network devices providing the connectivity,

    IMHO this is also true for Metro Ethernet with L3 devices.
  5. I think you have confused the situation with a bit of bias:
    - Metro Ethernet has good path protection schemes, which you have discounted in your analysis.
    - Metro Ethernet has filters available, and broadcast containment is not the exclusive domain of L3.
    - You eat your own dogfood with QinQ (what you put in is what you take out). The Metro Ethernet network does not melt; it is typically your own dogfood that is unpalatable.

    However, I agree with L3 termination but not for the reasons you have given:
    - traffic shaping is required to match the purchased bandwidth
    - paths need to be secured and optimized

    Stretched data centre VLANs are not a problem if there is sufficient capacity to disregard shaping requirements and the paths are secure. Most storage using FC is stretched.

    Meltdowns are not exclusive to L2 stupidity; there is an equal, if not larger, amount of L3 stupidity.
  6. Let's say that due to circumstances outside of your control, you must have stretched data center subnets... What is the best method to get these subnets into OSPF? Should they share a common area at each data center or should each data center utilize a separate area for the same subnet?