CEF load sharing details
I had to investigate the details of CEF load sharing for one of my upcoming articles and found (yet again) that they are poorly covered in the official documentation. So, this is how it works (in case you ever need to know):
- For every CEF entry (IP route) where there are multiple paths to the destination, the router creates a 16-row hash table, populating the entries with pointers to individual paths. The hash table can be inspected with the show ip cef prefix internal command.
- The load balancing ratio is approximated by the number of entries in the hash table belonging to each path. If you have unequal-cost load balancing (EIGRP based on composite metrics and MPLS TE tunnels based on requested bandwidth), individual paths will be associated with different numbers of rows.
- If you configure per-destination load balancing, the source and destination IP addresses in the incoming IP packet are hashed into a 4-bit value that selects the outgoing path in the CEF hash table.
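The per-destination selection described above can be sketched in a few lines of Python. The actual hash function IOS uses is not described here, so flow_hash below is a made-up stand-in (an XOR fold of the two addresses into 4 bits), and reducing the hash modulo the table length is also an assumption. Only the overall shape (hash the address pair, use the result to pick a row, follow that row to a path) comes from the description above.

```python
import ipaddress

def flow_hash(src: str, dst: str) -> int:
    """Hypothetical stand-in for the CEF hash: XOR-fold both addresses into 4 bits."""
    value = int(ipaddress.IPv4Address(src)) ^ int(ipaddress.IPv4Address(dst))
    folded = 0
    while value:
        folded ^= value & 0xF   # mix in the lowest 4 bits
        value >>= 4             # move on to the next 4 bits
    return folded               # always in the range 0..15

def select_path(table: list[int], src: str, dst: str) -> int:
    """Pick the outgoing path for a flow: hash -> row -> path index.

    The modulo is an assumption that keeps the row number valid for
    tables with fewer than 16 rows.
    """
    return table[flow_hash(src, dst) % len(table)]

# Two equal-cost paths: rows alternate between path 0 and path 1.
table = [0, 1] * 8

path = select_path(table, "10.0.0.1", "192.168.0.1")
print(path)
```

Because the hash depends only on the address pair, every packet of the same source/destination pair lands in the same row and therefore leaves through the same path.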
If this sounds confusing, here are two examples to make it easier: if you have two equal-cost paths to the same destination, each path will have eight entries in the hash table.
a1#show ip route 192.168.0.0
Routing entry for 192.168.0.0 255.255.255.0
Known via "ospf 1", distance 110, metric 51, type intra area
Last update from 172.16.0.21 on Serial0/0/0.100, 00:00:05 ago
Routing Descriptor Blocks:
* 172.16.0.21, from 172.16.0.22, 00:00:05 ago, via Serial0/0/0.100
Route metric is 51, traffic share count is 1
172.16.0.21, from 172.16.0.22, 00:00:05 ago, via Serial0/0/0.200
Route metric is 51, traffic share count is 1
a1#show ip cef 192.168.0.0 internal
192.168.0.0/24, version 33, epoch 0, per-destination sharing
0 packets, 0 bytes
via 172.16.0.21, Serial0/0/0.100, 0 dependencies
traffic share 1
next hop 172.16.0.21, Serial0/0/0.100
valid adjacency
via 172.16.0.21, Serial0/0/0.200, 0 dependencies
traffic share 1
next hop 172.16.0.21, Serial0/0/0.200
valid adjacency
0 packets, 0 bytes switched through the prefix
tmstats: external 0 packets, 0 bytes
internal 0 packets, 0 bytes
Load distribution: 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 (refcount 1)
Hash OK Interface Address Packets
1 Y Serial0/0/0.100 point2point 0
2 Y Serial0/0/0.200 point2point 0
3 Y Serial0/0/0.100 point2point 0
4 Y Serial0/0/0.200 point2point 0
5 Y Serial0/0/0.100 point2point 0
6 Y Serial0/0/0.200 point2point 0
7 Y Serial0/0/0.100 point2point 0
8 Y Serial0/0/0.200 point2point 0
9 Y Serial0/0/0.100 point2point 0
10 Y Serial0/0/0.200 point2point 0
11 Y Serial0/0/0.100 point2point 0
12 Y Serial0/0/0.200 point2point 0
13 Y Serial0/0/0.100 point2point 0
14 Y Serial0/0/0.200 point2point 0
15 Y Serial0/0/0.100 point2point 0
16 Y Serial0/0/0.200 point2point 0
However, if you have three equal-cost paths to the destination, each path will have only five entries and the hash table will have 15 rows instead of 16.
a1#show ip route 192.168.0.0
Routing entry for 192.168.0.0 255.255.255.0
Known via "ospf 1", distance 110, metric 51, type intra area
Last update from 10.0.0.6 on FastEthernet0/0, 00:00:02 ago
Routing Descriptor Blocks:
* 172.16.0.21, from 172.16.0.22, 00:00:02 ago, via Serial0/0/0.100
Route metric is 51, traffic share count is 1
172.16.0.21, from 172.16.0.22, 00:00:02 ago, via Serial0/0/0.200
Route metric is 51, traffic share count is 1
10.0.0.6, from 172.16.0.22, 00:00:02 ago, via FastEthernet0/0
Route metric is 51, traffic share count is 1
a1#show ip cef 192.168.0.0 internal
192.168.0.0/24, version 44, epoch 0, per-destination sharing
0 packets, 0 bytes
via 172.16.0.21, Serial0/0/0.100, 0 dependencies
traffic share 1
next hop 172.16.0.21, Serial0/0/0.100
valid adjacency
via 172.16.0.21, Serial0/0/0.200, 0 dependencies
traffic share 1
next hop 172.16.0.21, Serial0/0/0.200
valid adjacency
via 10.0.0.6, FastEthernet0/0, 0 dependencies
traffic share 1
next hop 10.0.0.6, FastEthernet0/0
valid adjacency
0 packets, 0 bytes switched through the prefix
tmstats: external 0 packets, 0 bytes
internal 0 packets, 0 bytes
Load distribution: 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 (refcount 1)
Hash OK Interface Address Packets
1 Y Serial0/0/0.100 point2point 0
2 Y Serial0/0/0.200 point2point 0
3 Y FastEthernet0/0 10.0.0.6 0
4 Y Serial0/0/0.100 point2point 0
5 Y Serial0/0/0.200 point2point 0
6 Y FastEthernet0/0 10.0.0.6 0
7 Y Serial0/0/0.100 point2point 0
8 Y Serial0/0/0.200 point2point 0
9 Y FastEthernet0/0 10.0.0.6 0
10 Y Serial0/0/0.100 point2point 0
11 Y Serial0/0/0.200 point2point 0
12 Y FastEthernet0/0 10.0.0.6 0
13 Y Serial0/0/0.100 point2point 0
14 Y Serial0/0/0.200 point2point 0
15 Y FastEthernet0/0 10.0.0.6 0
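The two Load distribution lines above are consistent with a simple rule: give each path a number of rows proportional to its traffic share, rounded down, and interleave the paths. The sketch below is an extrapolation from those two outputs, not documented IOS behavior. The function name build_load_distribution and the round-robin ordering are assumptions, and the unequal-share case in particular is only illustrative.

```python
def build_load_distribution(traffic_shares: list[int], max_rows: int = 16) -> list[int]:
    """Return a list of path indices, one per hash-table row.

    Assumed rule: each path gets max_rows * share // total rows, so the
    table may end up shorter than max_rows (three equal paths -> 15 rows).
    In this sketch a path with a very small share could get no rows at all.
    """
    total = sum(traffic_shares)
    remaining = [max_rows * share // total for share in traffic_shares]
    table = []
    # Interleave the paths round-robin until every path has used up its rows.
    while any(remaining):
        for path, left in enumerate(remaining):
            if left:
                table.append(path)
                remaining[path] -= 1
    return table

two_paths = build_load_distribution([1, 1])
three_paths = build_load_distribution([1, 1, 1])
print(" ".join(map(str, two_paths)))    # 16 rows, alternating 0 1
print(" ".join(map(str, three_paths)))  # 15 rows, repeating 0 1 2
```

With traffic shares of 1 and 3, the same rule would hand out 4 and 12 rows, which is the sense in which the row counts approximate the load balancing ratio.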
I just wanted to share with you what I found today, which in my opinion is strange IOS behavior.
Today I was analyzing a traffic flow for one of our customers and had to check the information in the CEF table regarding the 0.0.0.0/32 prefix. I was curious to get that information from the CEF table because there is multipath BGP load balancing:
tbirouter#show ip bgp 0.0.0.0/0
BGP routing table entry for 0.0.0.0/0, version 1774830
Paths: (2 available, best #1, table Default-IP-Routing-Table)
Multipath: eBGP
Not advertised to any peer
1234
Origin IGP, localpref 100, valid, external, multipath, best
5678
Origin IGP, localpref 100, valid, external, multipath
I decided to check what the CEF table says and tried these commands:
tbirouter#show ip cef 0.0.0.0 det
0.0.0.0/32, version 1, epoch 0, receive
tbirouter#show ip cef 0.0.0.0 int
0.0.0.0/32, version 1, epoch 0, receive
tbirouter#show ip cef 0.0.0.0/32 int
^
% Invalid input detected at '^' marker.
And here comes the most interesting part of the story: a mistyped command that got me exactly what I needed.
show ip cef 0.0.0.032 internal <------------------ 0.0.0.032
0.0.0.0/0, version 1303100, epoch 0, per-destination sharing
0 packets, 0 bytes
via 2.2.2.20 0 dependencies, recursive
traffic share 1
next hop 2.2.2.2, GigabitEthernet0/0.3467 via 2.2.2.2/32
valid adjacency
via 1.1.1.1, 0 dependencies, recursive
traffic share 1
next hop 1.1.1.10, GigabitEthernet0/0.3197 via 1.1.1.1/32
valid adjacency
0 packets, 0 bytes switched through the prefix
tmstats: external 0 packets, 0 bytes
internal 0 packets, 0 bytes
Load distribution: 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 (refcount 1)
Hash OK Interface Address Packets
1 Y GigabitEthernet0/0.3467 2.2.2.2 0
2 Y GigabitEthernet0/0.3197 1.1.1.1 0
3 Y GigabitEthernet0/0.3467 2.2.2.2 0
4 Y GigabitEthernet0/0.3197 1.1.1.1 0
5 Y GigabitEthernet0/0.3467 2.2.2.2 0
6 Y GigabitEthernet0/0.3197 1.1.1.1 0
7 Y GigabitEthernet0/0.3467 2.2.2.2 0
8 Y GigabitEthernet0/0.3197 1.1.1.1 0
9 Y GigabitEthernet0/0.3467 2.2.2.2 0
10 Y GigabitEthernet0/0.3197 1.1.1.1 0
11 Y GigabitEthernet0/0.3467 2.2.2.2 0
12 Y GigabitEthernet0/0.3197 1.1.1.1 0
13 Y GigabitEthernet0/0.3467 2.2.2.2 0
14 Y GigabitEthernet0/0.3197 1.1.1.1 0
15 Y GigabitEthernet0/0.3467 2.2.2.2 0
16 Y GigabitEthernet0/0.3197 1.1.1.1 0
refcount 718612, covered prefixes:
Have you noticed this?
Kind regards,
Dani Petrov
I agree, it's confusing :-E
I am studying MPLS these days and there is a very basic thing I am confused about. I tried to find the details on Google but didn't get any good link. Can someone please explain to me what the difference is between a routing table and a FIB?
I would really be very obliged if someone could help :-[
http://blog.ioshints.info/2010/09/ribs-and-fibs.html
The same topic is also described in my MPLS/VPN Architectures book.
Though, thinking about in-order delivery, I suppose fixing a flow to a specific path is a good thing. But what are the real risks, and how likely are they to occur, if per-packet load balancing is used?
Say buckets 0 2 4 6 8 10 12 14 are for interface 1
and buckets 1 3 5 7 9 11 13 15 are for interface 2.
If interface 2 goes down, will its buckets be freed and assigned to interface 1, or will those buckets simply not be used? If the buckets are freed, how long does it take before they are allocated to interface 2 again, or in other words, what is the bucket refresh time?
Can you point me to a source or RFC for this?