FullMesh added an excellent comment to my Multi-Chassis Link Aggregation (MLAG) and hot potato switching post. He wrote:
If there are two core routing switches and two access switches which are MLAGged together in both directions, and hosts that are dual-active LAGged to the pair of access switches, then the traffic would stay on whichever side the host places it.
He also opened another can of worms: load balancing in an MLAG environment is dictated by the end hosts. It doesn’t pay to have fancy switches that support L3 or L4 load balancing; a stupid host implementing destination-MAC-address-based load balancing can easily ruin your day.
Let’s start with a simple baseline architecture: two web servers and a router connected to a switch. The majority of the traffic flows from the web servers through the router to outside users.
In this architecture, the switch can reshuffle the packets based on its own load balancing algorithm regardless of the load balancing algorithm used by the servers. Even if the servers use a source-destination-MAC algorithm (which would send all the traffic over a single link), the switch can spread packets sent to different destination IP addresses¹ over both links toward the router. As long as the links between the servers and the switch aren’t congested, we don’t really care about the quality of the load balancing algorithm the servers use.
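To make the difference concrete, here is a minimal Python sketch of the two hashing choices. The XOR-and-modulo hash, the function names and all the addresses are assumptions made for illustration; real servers and switches use vendor-specific hash functions.

```python
from collections import Counter
import ipaddress


def mac_hash(src_mac: str, dst_mac: str, n_links: int) -> int:
    """Pick a LAG member link from the XOR of source and destination MAC."""
    src = int(src_mac.replace(":", ""), 16)
    dst = int(dst_mac.replace(":", ""), 16)
    return (src ^ dst) % n_links


def ip_hash(src_ip: str, dst_ip: str, n_links: int) -> int:
    """Pick a LAG member link from the XOR of source and destination IP."""
    src = int(ipaddress.ip_address(src_ip))
    dst = int(ipaddress.ip_address(dst_ip))
    return (src ^ dst) % n_links


# Hypothetical addresses: one web server, the router, eight outside users.
SERVER_MAC = "02:00:00:00:00:0a"
ROUTER_MAC = "02:00:00:00:00:01"
SERVER_IP = "192.0.2.10"
CLIENTS = [f"198.51.100.{i}" for i in range(1, 9)]

# Server side: every routed packet carries the router's MAC as destination,
# so the MAC-based hash returns the same link for every flow.
server_links = Counter(mac_hash(SERVER_MAC, ROUTER_MAC, 2) for _ in CLIENTS)

# Switch side: the IP-based hash sees a different destination per user.
switch_links = Counter(ip_hash(SERVER_IP, client, 2) for client in CLIENTS)

print("server -> switch:", dict(server_links))
print("switch -> router:", dict(switch_links))
```

With these sample addresses the server puts all eight flows on a single link, while the switch splits the same flows evenly across both links toward the router.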
Now let’s make the architecture redundant, introducing a second switch and combining the two switches into a multi-chassis link aggregation group:
All of a sudden, the switches (using hot potato switching) can no longer influence the traffic flow toward the router. If all the hosts decide to send their outgoing traffic toward S1, the link S1-R will be saturated even though the link S2-R will remain idle. The quality of the servers' load balancing algorithm becomes vital.
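The effect can be sketched in a few lines of Python. This is a toy model under stated assumptions: each server has one link to each switch, each switch has exactly one link to the router, and the host policies, names and addresses are hypothetical.

```python
from collections import Counter
import ipaddress

SWITCHES = ("S1", "S2")
# Hot potato switching: a switch always uses its own link toward the router.
LOCAL_UPLINK = {"S1": "S1-R", "S2": "S2-R"}


def host_pick_mac(src_mac: str, dst_mac: str, dst_ip: str) -> str:
    """Host policy: choose the MLAG member from source/destination MAC only."""
    key = int(src_mac.replace(":", ""), 16) ^ int(dst_mac.replace(":", ""), 16)
    return SWITCHES[key % 2]


def host_pick_ip(src_mac: str, dst_mac: str, dst_ip: str) -> str:
    """Host policy: choose the MLAG member from the destination IP address."""
    return SWITCHES[int(ipaddress.ip_address(dst_ip)) % 2]


def uplink_load(host_policy, servers, router_mac, clients) -> Counter:
    """Count flows per router-facing link for a given host policy."""
    load = Counter({link: 0 for link in LOCAL_UPLINK.values()})
    for src_mac in servers:
        for dst_ip in clients:
            ingress = host_policy(src_mac, router_mac, dst_ip)
            load[LOCAL_UPLINK[ingress]] += 1  # the switch cannot move the flow
    return load


# Hypothetical addresses: two servers, the router, eight outside users.
SERVERS = ["02:00:00:00:00:0a", "02:00:00:00:00:0c"]
ROUTER_MAC = "02:00:00:00:00:02"
CLIENTS = [f"198.51.100.{i}" for i in range(1, 9)]

mac_load = uplink_load(host_pick_mac, SERVERS, ROUTER_MAC, CLIENTS)
ip_load = uplink_load(host_pick_ip, SERVERS, ROUTER_MAC, CLIENTS)
print("MAC-based hosts:", dict(mac_load))
print("IP-based hosts: ", dict(ip_load))
```

With these sample addresses the MAC-based hosts send all sixteen flows through S1 and leave S2-R idle, while the IP-based hosts split them evenly. The switches play no part in the outcome; the distribution is decided entirely by the host policy.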
Finally, let’s add two more links between the switches and the router:
In this architecture, each switch can use its own load balancing algorithm on its directly connected links: if all the hosts decide to send outbound traffic to S1, S1 can still load-balance the traffic according to its own rules on the parallel links between S1 and R. Of course, it still cannot shift any of the traffic toward S2.
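A rough Python sketch of this last design, assuming two parallel links per switch and a simple XOR hash on IP addresses (the link names, the hash and the addresses are all made up for the example):

```python
from collections import Counter
import ipaddress

# Each switch can only hash across its own links toward the router.
LOCAL_LINKS = {
    "S1": ("S1-R-1", "S1-R-2"),
    "S2": ("S2-R-1", "S2-R-2"),
}


def switch_forward(ingress_switch: str, src_ip: str, dst_ip: str) -> str:
    """Pick one of the ingress switch's local links based on an IP hash."""
    members = LOCAL_LINKS[ingress_switch]
    key = int(ipaddress.ip_address(src_ip)) ^ int(ipaddress.ip_address(dst_ip))
    return members[key % len(members)]


# Hypothetical addresses: one server and eight outside users.
SERVER_IP = "192.0.2.10"
CLIENTS = [f"198.51.100.{i}" for i in range(1, 9)]

# Worst case from the text: the hosts send every flow to S1.
link_load = Counter(
    {link: 0 for links in LOCAL_LINKS.values() for link in links}
)
for client in CLIENTS:
    link_load[switch_forward("S1", SERVER_IP, client)] += 1

print(dict(link_load))
```

S1 spreads the eight flows over its two local links, but both S2 links stay empty because nothing in the model lets S1 hand traffic over to S2.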
- Multi-chassis Link Aggregation (and numerous other LAN, SAN and virtualization technologies) is described in the Data Center 3.0 for Networking Engineers webinar (buy a recording or yearly subscription).
- Read my posts about Multi-chassis Link Aggregation basics, Stacking on Steroids and External Brains architectures.
- The load balancing issues described in this article are caused by hot potato switching.
¹ Or even packets from different TCP/UDP sessions sent to the same destination if you configured 5-tuple load balancing.