Internet-in-a-VRF and LFIB Explosion

This blog post is over a decade old. I hope the only time you’ll see a Catalyst 6500 is in a computer museum (although I’ve heard some organizations still run them in production networks), and the full BGP table has almost a million entries. However, the underlying dilemma is as relevant as it was in those days; we still don’t have infinite forwarding tables.

Wednesday, February 13, 2013 07:40 +0100

Matthew Stone encountered another unintended consequence of full Internet routing in a VRF design: the TCAM on his 6500 was 80% utilized even though he has the new Sup modules with one million IPv4 routes.

A closer look revealed the first clue: L3 forwarding resources on a Cat6500 are shared between IPv4 routes and MPLS labels (I don’t know about you, but I was not aware of that), and half the entries were consumed by MPLS labels:

L3 Forwarding Resources
   FIB TCAM usage:                     Total        Used       %Used
       72 bits (IPv4, MPLS, EoM)     1048576      843727         80%
       144 bits (IP mcast, IPv6)      524288       11654          2%
       288 bits (IPv6 mcast)          262144           3          1%

   detail:      Protocol                    Used       %Used
                IPv4                      433781         41%
                MPLS                      409945         39%
                EoM                            1          1%

                IPv6                       11639          2%
                IPv4 mcast                    15          1%
                IPv6 mcast                     3          1%
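
A quick sanity check of those numbers, as a minimal sketch: the figures are copied from the printout above, while the variable names and the truncating percentage calculation are my own assumptions, not anything the router documents.

```python
# Figures copied from the "L3 Forwarding Resources" printout.
TOTAL_72_BIT_ENTRIES = 1_048_576
used_by_protocol = {"IPv4": 433_781, "MPLS": 409_945, "EoM": 1}

# The per-protocol counters should add up to the "Used" column.
used_total = sum(used_by_protocol.values())


def percent_of_tcam(entries: int) -> int:
    """Whole-number share of the 72-bit TCAM region (truncated; an assumption)."""
    return entries * 100 // TOTAL_72_BIT_ENTRIES


# Share of the *used* entries that hold MPLS labels rather than IPv4 routes.
mpls_share_of_used = used_by_protocol["MPLS"] / used_total

print(f"used entries: {used_total} ({percent_of_tcam(used_total)}% of the region)")
print(f"MPLS share of used entries: {mpls_share_of_used:.1%}")
```

The per-protocol counters add up to the 843727 entries reported in the summary line, and MPLS labels account for a bit less than half of them.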

What’s Up?

There’s a fundamental difference in the way MPLS assigns labels to BGP routes in different routing tables:

MPLS labels are not assigned to BGP routes in the global routing table. When the router copies BGP routes from the RIB into the FIB, it uses the label its downstream neighbor allocated to the BGP next hop. All BGP routes advertised by the same BGP next hop thus get the same label.
A unique MPLS label is assigned to every VRF route when it’s imported into the VPNv4 address family. In the Internet-in-a-VRF design, the Internet edge PE-routers receive Internet routing through EBGP sessions running in a VRF, and those routes automatically appear in the VPNv4 address family (and get their labels) even if they are never propagated to other PE-routers.

Net result: if you have plenty of BGP routes in the global routing table (for example, around 450,000), your router allocates a local MPLS label for each BGP next hop, and there are usually only a few of those. If the same routes move to a VRF, your router allocates a local MPLS label for each route.
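
To make the difference tangible, here is a toy model of the label count under different allocation schemes. It is a sketch, not router code: the `VrfRoute` type, the `labels_needed` function, the mode names, and the documentation prefixes and next hops are all made up for illustration.

```python
from typing import Iterable, NamedTuple


class VrfRoute(NamedTuple):
    """Hypothetical VRF route: just the fields that drive label allocation."""
    vrf: str
    prefix: str
    next_hop: str


def labels_needed(routes: Iterable[VrfRoute], mode: str) -> int:
    """Count local labels a PE would allocate under a given (toy) mode."""
    routes = list(routes)
    if mode == "per-prefix":        # one label per VRF route
        keys = {(r.vrf, r.prefix) for r in routes}
    elif mode == "per-next-hop":    # one label per next hop within a VRF
        keys = {(r.vrf, r.next_hop) for r in routes}
    elif mode == "per-vrf":         # one label per VRF
        keys = {r.vrf for r in routes}
    else:
        raise ValueError(f"unknown mode: {mode}")
    return len(keys)


# Four Internet routes learned from two upstream EBGP neighbors (made-up data).
routes = [
    VrfRoute("Internet", "192.0.2.0/24", "10.0.0.1"),
    VrfRoute("Internet", "198.51.100.0/24", "10.0.0.1"),
    VrfRoute("Internet", "203.0.113.0/24", "10.0.0.2"),
    VrfRoute("Internet", "203.0.113.128/25", "10.0.0.2"),
]

for mode in ("per-prefix", "per-next-hop", "per-vrf"):
    print(mode, labels_needed(routes, mode))
```

With four routes the difference is four labels versus two or one; with a full Internet table in the VRF, the per-prefix count grows with the table while the other two stay roughly constant.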

Why All the Fuss?

To make a long story short, the creators of the MPLS architecture wanted to minimize forwarding hardware requirements. Thus, they created a solution that ensures LSRs (including PE-routers) forward packets (both IPv4 and labeled packets) with a single lookup in a single table.

The proof is left as an exercise for the reader. I know a really good one, but it wouldn’t fit in the sidebar of this blog post.

Can We Fix It? Yes, We Can!

Wherever there’s a challenge, there’s a kludge. In this particular case, the magic command is mpls label mode vrf Internet protocol all-afs per-vrf. This command changes the label allocation mechanism from one-label-per-prefix to one-label-per-VRF.

With the changed label allocation model, the incoming label no longer uniquely identifies the outgoing interface and IP next hop. The egress PE-router thus has to perform two lookups: label lookup to identify the next lookup table (VRF FIB) and IPv4 destination address lookup in the VRF FIB.
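
The two forwarding models can be sketched in a few lines of Python. This is a simplified illustration under my own assumptions: the label values, interface names, prefixes, and function names are hypothetical, and real hardware does the lookups in TCAM rather than in dictionaries.

```python
import ipaddress

# Per-prefix allocation: the label alone identifies interface and next hop.
per_prefix_lfib = {
    1001: ("Gi1/1", "10.0.0.1"),
    1002: ("Gi1/2", "10.0.0.2"),
}


def forward_per_prefix(label: int) -> tuple[str, str]:
    """Single lookup: label -> (outgoing interface, IP next hop)."""
    return per_prefix_lfib[label]


# Per-VRF allocation: the label only selects the VRF FIB.
per_vrf_lfib = {2001: "Internet"}
vrf_fib = {
    "Internet": {
        ipaddress.ip_network("192.0.2.0/24"): ("Gi1/1", "10.0.0.1"),
        ipaddress.ip_network("192.0.2.128/25"): ("Gi1/2", "10.0.0.2"),
        ipaddress.ip_network("198.51.100.0/24"): ("Gi1/2", "10.0.0.2"),
    }
}


def forward_per_vrf(label: int, destination: str) -> tuple[str, str]:
    """Two lookups: label -> VRF, then longest-prefix match in that VRF FIB."""
    fib = vrf_fib[per_vrf_lfib[label]]              # lookup 1: label
    address = ipaddress.ip_address(destination)
    matches = [net for net in fib if address in net]
    if not matches:
        raise LookupError(f"no route to {destination}")
    best = max(matches, key=lambda net: net.prefixlen)  # lookup 2: IPv4 LPM
    return fib[best]


print(forward_per_prefix(1001))
print(forward_per_vrf(2001, "192.0.2.200"))
```

Note that the per-VRF path needs the destination IP address from the packet, which is exactly why the egress PE-router can no longer forward on the label alone.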

The performance hit on the Cat 6500 seems to be minimal (at least the documentation claims so), but you lose the ability to do EIBGP multipathing (IPv4 lookup in the egress PE-router could lead to forwarding loops) and Carrier’s Carrier functionality (IPv4 lookup in the egress PE-router breaks the end-to-end LSP between CE-routers) in the VRFs for which you’ve configured per-VRF label allocation.


10 comments:

Roman Sokolov 13 February 2013 09:45

Don't forget about VPN CAM when using per-vrf labels. Or you will hit recirculation for overflowing vrfs (more than 512).

Fredrik Westermark 13 February 2013 10:28

Love these types of solid and concrete posts!

Anonymous 13 February 2013 19:53

How does JUNOS handle this?

Tiziano 14 February 2013 14:32

vrf-table-label

Anonymous 15 February 2013 07:29

Assigning an MPLS label per next-hop would be a nice middle-of-the-road solution. No need to do double lookups, and it saves a lot of memory.

Anonymous 16 February 2013 14:19

Already happened to face and solve the issue, but the post is great for sure.

Mijo 18 February 2013 00:31

On ASR9K/XR there's a third solution which avoids the extra lookup: you can do per next-hop label allocation.

Quote from: http://www.cisco.com/en/US/docs/ios_xr_sw/iosxr_r3.6/routing/configuration/guide/rc36bgp.html#wpmkr1456095

label-allocation-mode per-ce

Configures the per-CE label allocation mode to avoid an extra lookup on the PE router and conserve label space (per-prefix is the default label allocation mode). In this mode, the PE router allocates one label for every immediate next-hop (in most cases, this would be a CE router). This label is directly mapped to the next hop, so there is no VRF route lookup performed during data forwarding. However, the number of labels allocated would be one for each CE rather than one for each VRF. Because BGP knows all the next hops, it assigns a label for each next hop (not for each PE-CE interface). When the outgoing interface is a multiaccess interface and the media access control (MAC) address of the neighbor is not known, Address Resolution Protocol (ARP) is triggered during packet forwarding.

Andras Toth 12 March 2013 00:52

This is available from 12.2(33)SXH and later on Cat 6500 and the command is the following:
mpls label mode all-vrfs protocol bgp-vpnv4 per-vrf

On 12.2SR code the command is:
mpls label mode all-vrfs protocol all-afs per-vrf

Anonymous 31 July 2013 16:00

IOS XE3.10S (ASR1k) "per-CE label allocation":

http://www.cisco.com/en/US/docs/routers/asr1000/release/notes/asr1k_feats_important_notes_310s.html#wp3378629

Anonymous 08 January 2014 08:56

Juniper does per next-hop (like per-ce) as well. Configure your export policy like this:
then {
label-allocation per-nexthop;
community add vpn1;
accept;
}

I don't know why this wasn't the default/standard from all vendors right from the start..
