Load sharing 101 (with references)

It looks like my load sharing posts (as well as related IP corner and wiki articles) did not paint the whole picture; I’m always assuming the readers have a basic level of IP routing knowledge (somewhere around BSCI/CCNP) and jump into juicy details. Let’s try to fix this error and start from the beginning.

A router receives its routing information (reachability of IP prefixes) from various sources: connected IP prefixes, static routes and dynamic routing protocols. For every IP prefix, the best source (= one with the lowest administrative distance) is selected and only the route(s) from that source are included in the IP routing table.
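The source-selection step above can be sketched in a few lines of Python. This is only an illustration: the `ADMIN_DISTANCE` table lists a handful of common Cisco IOS default values, and the function name and the `(source, next_hop)` tuple layout are made up for this example.

```python
# A few common Cisco IOS default administrative distances (subset, for illustration).
ADMIN_DISTANCE = {"connected": 0, "static": 1, "eigrp": 90, "ospf": 110, "rip": 120}


def select_best_source(candidates):
    """Keep only the routes offered by the source with the lowest distance.

    candidates: list of (source, next_hop) tuples, all for the SAME IP prefix.
    """
    best = min(ADMIN_DISTANCE[source] for source, _ in candidates)
    return [(source, nh) for source, nh in candidates if ADMIN_DISTANCE[source] == best]


candidates = [("ospf", "10.0.0.1"), ("rip", "10.0.0.2"), ("ospf", "10.0.0.3")]
print(select_best_source(candidates))
# [('ospf', '10.0.0.1'), ('ospf', '10.0.0.3')]
```

Both OSPF routes survive because they come from the same (best) source; the RIP route never reaches the IP routing table.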

If the best source offers multiple equal-cost routes, more than one (up to the value set with the maximum-paths router configuration command) can be installed in the IP routing table and used for load sharing.
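A minimal sketch of that rule, assuming routes are plain `(next_hop, metric)` tuples. The function name, the default of 4 and the choice of which routes to keep when there are more equal-cost candidates than `maximum_paths` are assumptions made for illustration, not a description of what IOS does internally.

```python
def install_routes(routes, maximum_paths=4):
    """Return the equal-cost routes that would be installed for one prefix.

    routes: list of (next_hop, metric) tuples from the best source.
    maximum_paths: stands in for the maximum-paths setting (default is an assumption).
    """
    best_metric = min(metric for _, metric in routes)
    equal_cost = [route for route in routes if route[1] == best_metric]
    # Assumption: when there are too many candidates, keep the first ones seen.
    return equal_cost[:maximum_paths]


routes = [("10.0.0.1", 20), ("10.0.0.2", 20), ("10.0.0.3", 30)]
print(install_routes(routes))                   # both metric-20 routes
print(install_routes(routes, maximum_paths=1))  # only one route, no load sharing
```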

Things can get tricky when you use BGP; read the Load balancing in BGP networks IP corner article for details.

What happens next depends on the forwarding (switching) mechanism used by the router. All switching mechanisms perform best-prefix match: they use the entry (or entries) with the most specific (longest) prefix matching the destination IP address. If the best prefix has multiple entries in the IP routing table (traffic toward it can be load-shared), the action varies by switching mechanism:

Process switching: The router performs an IP routing table lookup for every forwarded packet and selects one out of several possible entries in a round-robin fashion, resulting in per-packet load balancing.

Fast switching: The initial packet is process-switched (see above) and a cache entry is created based on the results of the IP routing table lookup. Further packets use the cached entry and thus always use only one of the possible routes.

The rules are somewhat crazy; sometimes fast switching would create per-host entries (resulting in better spread of the load), but in most cases (including the worst one: two default routes) it behaves as described.
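The difference between process switching and fast switching can be illustrated with a toy model. The class names are hypothetical, and the cache here is keyed by destination prefix only, which is a simplification of the behavior described above (it ignores the per-host entries fast switching sometimes creates).

```python
import itertools


class ProcessSwitching:
    """Every packet triggers a lookup; next hops are used round-robin."""

    def __init__(self, next_hops):
        self._cycle = itertools.cycle(next_hops)

    def forward(self, dst_prefix):
        return next(self._cycle)


class FastSwitching:
    """The first packet is process-switched; later packets reuse the cache entry."""

    def __init__(self, next_hops):
        self._slow_path = ProcessSwitching(next_hops)
        self._cache = {}  # destination prefix -> cached next hop

    def forward(self, dst_prefix):
        if dst_prefix not in self._cache:
            self._cache[dst_prefix] = self._slow_path.forward(dst_prefix)
        return self._cache[dst_prefix]


ps = ProcessSwitching(["A", "B"])
print([ps.forward("0.0.0.0/0") for _ in range(4)])  # ['A', 'B', 'A', 'B']

fs = FastSwitching(["A", "B"])
print([fs.forward("0.0.0.0/0") for _ in range(4)])  # ['A', 'A', 'A', 'A']
```

The second output shows the worst case mentioned above: with two default routes, everything matching the default route ends up on a single path.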

CEF switching: Information from the IP routing table is evaluated (including recursive lookups) and transferred into the CEF FIB (forwarding information base). For every IP prefix with multiple entries in the IP routing table, CEF creates a table of 16 slots and populates them with alternate routes to the IP prefix (in unequal-cost load sharing some routes are used more often than the others).
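One way to picture the 16-slot table is a simple proportional fill. The sketch below is an assumption-laden illustration: the function name, the integer weights and the way leftover slots are handed out are invented here, and the real IOS allocation (including slot ordering) may well differ.

```python
def build_slots(paths, slots=16):
    """Fill a fixed-size slot table in proportion to each path's weight.

    paths: list of (next_hop, weight) tuples; weights are relative traffic shares.
    """
    total = sum(weight for _, weight in paths)
    table = []
    for next_hop, weight in paths:
        table.extend([next_hop] * (slots * weight // total))
    # Assumption: slots left over after rounding down go to the paths in order.
    i = 0
    while len(table) < slots:
        table.append(paths[i % len(paths)][0])
        i += 1
    return table


equal = build_slots([("A", 1), ("B", 1)])
unequal = build_slots([("A", 3), ("B", 1)])
print(equal.count("A"), equal.count("B"))      # 8 8
print(unequal.count("A"), unequal.count("B"))  # 12 4
```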

CEF supports per-destination, per-packet and per-session load sharing. Per-packet load sharing is obvious: the 16 slots are used in a round-robin fashion. Per-destination load sharing takes the source and destination IP address from the forwarded packet, scrambles (the correct term is hashes) them together to get a 4-bit number (between 0 and 15) and selects the route from the corresponding slot in the 16-slot table to forward the packet. Per-session load sharing uses source and destination UDP/TCP ports together with the source and destination IP address in the hashing function.
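The slot selection for per-destination and per-session load sharing can be sketched as follows. The actual hashing function used by Cisco IOS is not described in this post, so SHA-256 is used purely as a stand-in, and the function names and parameters are hypothetical.

```python
import hashlib
import ipaddress


def slot_index(src_ip, dst_ip, src_port=None, dst_port=None):
    """Hash the addresses (and optionally the ports) into a 4-bit slot number."""
    key = ipaddress.ip_address(src_ip).packed + ipaddress.ip_address(dst_ip).packed
    if src_port is not None and dst_port is not None:
        # Per-session load sharing: the TCP/UDP ports take part in the hash.
        key += src_port.to_bytes(2, "big") + dst_port.to_bytes(2, "big")
    digest = hashlib.sha256(key).digest()  # stand-in, NOT the real IOS hash
    return digest[0] & 0x0F                # keep 4 bits: a value between 0 and 15


def select_next_hop(slots, src_ip, dst_ip, src_port=None, dst_port=None):
    return slots[slot_index(src_ip, dst_ip, src_port, dst_port)]


slots = ["A"] * 8 + ["B"] * 8  # a 16-slot table with two equal-cost paths

# Per-destination: every packet between the same two hosts takes the same path.
print(select_next_hop(slots, "192.0.2.10", "198.51.100.20"))

# Per-session: different sessions between the same hosts may take different paths.
print(select_next_hop(slots, "192.0.2.10", "198.51.100.20", 40000, 80))
print(select_next_hop(slots, "192.0.2.10", "198.51.100.20", 40001, 80))
```

Because the hash is deterministic, packets of one flow are never reordered across paths; the price is that a few heavy flows can still land in slots pointing to the same route.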

6 comments:

  1. Your ardent student, 09 December, 2009 19:09

    When it comes to driving points home, you are undoubtedly gifted. Ever considered cloning yourself? :->

  2. Supposedly it's illegal in a few countries, the results are somewhat unpredictable, and with current technology the process takes a while (probably 20 to 25 years to pass CCIE and then a few more to gain some experience).

  3. Hi,
    Great post. Where can I find out how the hashing described above works? I would like to understand it in more detail. Are there any reference sites?

    Thanks

  4. Ah, the famous hashing function. You probably need access to the Cisco IOS source code.

  5. Ivan,
    Why is it that whenever load balancing is discussed, OER/PfR are never mentioned? I mean, I had plenty of issues trying to deploy OER, but I never read about it from an IOS pro.
    I suppose I must be shining a newbie light on myself... but I've only seen OER mentioned in lab discussions.

  6. Looks like it's time to write about it ;)



Ivan Pepelnjak, CCIE#1354, is the chief technology advisor for NIL Data Communications. He has been designing and implementing large-scale data communications networks as well as teaching and writing books about advanced technologies since 1990. See his full profile, contact him or follow @ioshints on Twitter.