I blogged about the need for optimal L3 forwarding across the whole data center almost a year ago, when I introduced it as one of the interesting requirements in the Data Center Fabrics webinar. A year later, only a few companies can deliver this functionality.
Fabric solutions that appear as a single system to the outside world usually offer optimal L3 forwarding. These solutions include:
- Stacked ToR switches and similar solutions (including HP IRF and Juniper’s Virtual Chassis) definitely fall in this category (note: using stacked switches or virtual chassis architectures with ring-based interconnect in environments with heavy east-west traffic is NOT a good idea);
- Other architectures that present the whole fabric as a single layer-3 entity: Juniper’s QFabric and Plexxi’s Affinity Networks;
- Controller-based solutions like NEC’s ProgrammableFlow (more: ProgrammableFlow Basics and Virtual Tenant Networks).
Plexxi’s Affinity Networking is architecturally closer to QFabric than to OpenFlow-based networks, as the individual switches retain a large amount of intelligence and autonomy. More in a few weeks when we clean up the recording of Dan Backman explaining the Plexxi architecture during the recent Data Center Fabrics Update webinar.
However, I’m aware of only two companies that can do optimal L3 forwarding across the whole data center while using a traditional network of independent devices: Arista with Virtual ARP and Enterasys with Fabric Routing.
Arista’s Virtual ARP is extremely simple: it’s like VRRP without VRRP. You have to configure the same IP address (the first-hop gateway) on a VLAN interface of all ToR switches with the ip virtual-router address interface configuration command, and associate a MAC address with the shared IP address with the ip virtual-router mac-address global configuration command.
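To make the two commands more concrete, here is a minimal Python sketch that renders the Virtual ARP part of the configuration for each ToR switch. The VLAN number, IP addresses, MAC address, switch names and the render_varp_config helper are all made up for illustration; only the two ip virtual-router commands come from the description above.

```python
def render_varp_config(vlan_id, switch_ip, virtual_ip, virtual_mac):
    """Return EOS-style configuration lines for one ToR switch."""
    return "\n".join([
        # Shared MAC address: must be identical on every ToR switch.
        f"ip virtual-router mac-address {virtual_mac}",
        f"interface Vlan{vlan_id}",
        # Unique per-switch address on the VLAN interface.
        f"   ip address {switch_ip}",
        # Shared first-hop gateway: identical on every ToR switch.
        f"   ip virtual-router address {virtual_ip}",
    ])


# Hypothetical values, for illustration only.
SWITCH_IPS = {"tor1": "10.0.10.2/24", "tor2": "10.0.10.3/24"}
CONFIGS = {
    name: render_varp_config(10, ip, "10.0.10.1", "00:1c:73:00:00:99")
    for name, ip in SWITCH_IPS.items()
}

if __name__ == "__main__":
    for name, config in CONFIGS.items():
        print(f"! {name}\n{config}\n")
```

The only line that differs between switches is the interface IP address; the shared gateway IP and MAC address are the same everywhere, which is what makes every ToR switch a valid first hop.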
The first switch that is hit with an ARP request for the shared virtual IP address will reply with the shared MAC address (I’m not sure about the details – it might well be that the ARP broadcast gets flooded to all switches, in which case the sender gets numerous replies). When a host sends an IP packet to that same shared MAC address, the first ToR switch that the packet hits intercepts the packet (because it’s listening to the shared MAC address), and performs L3 routing.
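The forwarding behavior described above can be sketched as a toy Python model. It is not a description of how EOS implements Virtual ARP: the class, function names and addresses are hypothetical, and ARP flooding between switches is deliberately left out because I’m not sure how it behaves.

```python
from dataclasses import dataclass

VIRTUAL_IP = "10.0.10.1"            # hypothetical shared gateway address
VIRTUAL_MAC = "00:1c:73:00:00:99"   # hypothetical shared MAC address


@dataclass
class TorSwitch:
    name: str

    def arp_reply(self, target_ip):
        """Answer ARP requests for the shared gateway IP with the shared MAC."""
        return VIRTUAL_MAC if target_ip == VIRTUAL_IP else None

    def forward(self, dst_mac):
        """Route frames sent to the shared MAC; bridge everything else."""
        if dst_mac == VIRTUAL_MAC:
            return f"routed by {self.name}"
        return f"bridged by {self.name}"


def first_hop_for(host_tor):
    """A host resolves its gateway, then sends a frame to the learned MAC."""
    gateway_mac = host_tor.arp_reply(VIRTUAL_IP)
    return host_tor.forward(gateway_mac)


if __name__ == "__main__":
    for tor in (TorSwitch("tor1"), TorSwitch("tor2")):
        print(first_hop_for(tor))
```

In this model every host is routed by the ToR switch it is attached to, so inter-subnet traffic never has to take a detour to a central gateway.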
Things might get nasty if you have configuration mismatches (for example, a missing ip virtual-router address configuration on one of the ToR switches), so make sure you use some sort of orchestration system to configure the ToR switches. The XMPP client implemented in Arista EOS might be good enough.
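Whatever tool you use, a simple consistency check across the switch configurations catches the mismatch mentioned above. The following Python sketch assumes you already have the configuration of every ToR switch as text; the sample configurations, switch names and helper functions are hypothetical, and the check simply flags every switch whose Virtual ARP settings differ from the most common ones.

```python
import re
from collections import Counter


def varp_settings(config_text):
    """Extract the shared MAC and the set of shared gateway IPs from a config."""
    mac = re.search(r"^ip virtual-router mac-address (\S+)", config_text, re.M)
    vips = re.findall(r"^[ \t]+ip virtual-router address (\S+)", config_text, re.M)
    return (mac.group(1) if mac else None, tuple(sorted(vips)))


def find_mismatches(configs):
    """Return switches whose settings differ from the most common settings."""
    settings = {name: varp_settings(text) for name, text in configs.items()}
    reference = Counter(settings.values()).most_common(1)[0][0]
    return sorted(name for name, value in settings.items() if value != reference)


# Hypothetical sample configurations; tor3 lacks the shared gateway address.
GOOD = """ip virtual-router mac-address 00:1c:73:00:00:99
interface Vlan10
   ip virtual-router address 10.0.10.1
"""
BAD = """ip virtual-router mac-address 00:1c:73:00:00:99
interface Vlan10
   ip address 10.0.10.4/24
"""

if __name__ == "__main__":
    print(find_mismatches({"tor1": GOOD, "tor2": GOOD, "tor3": BAD}))
```

Using the most common settings as the reference is a shortcut that works only when most switches are configured correctly; comparing against a single source of truth in the orchestration system is more robust.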