Internet-in-a-VRF and LFIB Explosion
Matthew Stone encountered another unintended consequence of the full-Internet-routing-in-a-VRF design: the TCAM on his 6500 was 80% utilized even though he has the new Sup modules that support one million IPv4 routes.
A closer look revealed the first clue: L3 forwarding resources on a Cat6500 are shared between IPv4 routes and MPLS labels (I don’t know about you, but I was not aware of that), and almost half of the used entries were consumed by MPLS labels:
L3 Forwarding Resources
  FIB TCAM usage:                    Total      Used   %Used
    72 bits (IPv4, MPLS, EoM)      1048576    843727     80%
    144 bits (IP mcast, IPv6)       524288     11654      2%
    288 bits (IPv6 mcast)           262144         3      1%
  detail:    Protocol       Used   %Used
             IPv4         433781     41%
             MPLS         409945     39%
             EoM               1      1%
             IPv6          11639      2%
             IPv4 mcast       15      1%
             IPv6 mcast        3      1%
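A quick sanity check of the printout, sketched in Python: the three protocols that share the 72-bit TCAM region should add up to the reported usage, and MPLS should account for roughly half of the used entries. The numbers are copied from the printout; the variable names are mine.

```python
# Figures copied from the 72-bit rows of the FIB TCAM printout.
TOTAL_72_BIT = 1_048_576
used = {"IPv4": 433_781, "MPLS": 409_945, "EoM": 1}

used_total = sum(used.values())
print(f"72-bit entries used: {used_total} "
      f"({used_total / TOTAL_72_BIT:.1%} of capacity)")

# Share of each protocol, both against total capacity and against used entries.
for proto, count in used.items():
    print(f"{proto:5} {count / TOTAL_72_BIT:6.1%} of capacity, "
          f"{count / used_total:6.1%} of used entries")
```

The per-protocol counts sum to the 843727 entries shown in the "72 bits" row, so the MPLS labels really do come out of the same pool as the IPv4 routes.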
What’s Up?
There’s a fundamental difference in the way MPLS assigns labels to BGP routes in different routing tables:
- MPLS labels are not assigned to BGP routes in the global routing table. When the router copies BGP routes from RIB into FIB, it uses the labels its downstream neighbor allocated to the BGP next hop. All BGP routes advertised by the same BGP next hop thus get the same label.
- A unique MPLS label is assigned to every VRF route when it’s imported into the VPNv4 address family. In the Internet-in-the-VRF design, the Internet edge PE-routers receive Internet routing through EBGP sessions running in a VRF, and those routes automatically appear in the VPNv4 address family (and get their labels) even if they are never propagated to other PE-routers.
Net result: if you have plenty of BGP routes in the global routing table (for example, around 450,000), your router allocates a local MPLS label for each BGP next hop. If those routes move to a VRF, your router allocates a local MPLS label for each route.
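The difference is easy to see in a toy counting model. This is only a sketch of the two allocation rules described above, not of how any router implements them; the route list, the number of next hops (four), and the function name are made up for illustration.

```python
def count_local_labels(routes, per_prefix):
    """Count local labels needed for (prefix, bgp_next_hop) pairs.

    per_prefix=False models the global table: one label per BGP next hop.
    per_prefix=True models the default VRF behavior: one label per route.
    """
    if per_prefix:
        return len({prefix for prefix, _ in routes})
    return len({next_hop for _, next_hop in routes})


# Hypothetical Internet table: 450,000 prefixes learned from 4 next hops.
routes = [(f"prefix-{i}", f"next-hop-{i % 4}") for i in range(450_000)]

print("global table:", count_local_labels(routes, per_prefix=False))
print("VRF         :", count_local_labels(routes, per_prefix=True))
```

With these assumed inputs the global table needs four labels and the VRF needs 450,000, which is the order of magnitude seen in the TCAM printout.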
Why All the Fuss?
To make a long story short, the creators of the MPLS architecture wanted to minimize forwarding hardware requirements. Thus, they created a solution that ensures LSRs (including PE-routers) forward packets (both IPv4 and labeled packets) with a single lookup in a single table.
The proof is left as an exercise for the reader. I know a really good one, but it wouldn’t fit in the sidebar of this blog post.
Can We Fix It? Yes, We Can!
Wherever there’s a challenge, there’s a kludge. In this particular case, the magic command is mpls label mode vrf Internet protocol all-afs per-vrf. This command changes the label allocation mechanism from one-label-per-prefix to one-label-per-VRF.
With the changed label allocation model, the incoming label no longer uniquely identifies the outgoing interface and IP next hop. The egress PE-router thus has to perform two lookups: label lookup to identify the next lookup table (VRF FIB) and IPv4 destination address lookup in the VRF FIB.
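The two forwarding models can be sketched in a few lines of Python. Everything here is illustrative: the label values, interface names, and prefixes (taken from documentation address ranges) are assumptions, and a real LFIB or FIB is a hardware structure, not a dictionary.

```python
import ipaddress

# Per-prefix allocation: the label alone selects (interface, next hop).
per_prefix_lfib = {
    1001: ("Gi1/1", "192.0.2.1"),
    1002: ("Gi1/2", "192.0.2.5"),
}

# Per-VRF allocation: the label only selects the VRF to look in.
per_vrf_lfib = {2001: "Internet"}
vrf_fib = {
    "Internet": {
        ipaddress.ip_network("198.51.100.0/24"): ("Gi1/1", "192.0.2.1"),
        ipaddress.ip_network("198.51.100.128/25"): ("Gi1/2", "192.0.2.5"),
        ipaddress.ip_network("203.0.113.0/24"): ("Gi1/2", "192.0.2.5"),
    }
}


def forward_per_prefix(label):
    """Single lookup: label -> outgoing interface and next hop."""
    return per_prefix_lfib[label]


def forward_per_vrf(label, dst):
    """Two lookups: label -> VRF table, then longest-prefix match on dst."""
    table = vrf_fib[per_vrf_lfib[label]]          # lookup 1: label
    addr = ipaddress.ip_address(dst)
    matches = [net for net in table if addr in net]
    if not matches:
        return None                               # no route in the VRF
    best = max(matches, key=lambda net: net.prefixlen)
    return table[best]                            # lookup 2: IPv4 address


print(forward_per_prefix(1001))
print(forward_per_vrf(2001, "198.51.100.200"))
```

The second function is the one that needs the packet's IPv4 destination address, which is why the egress PE-router can no longer forward on the label alone.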
The performance hit on the Cat 6500 seems to be minimal (at least the documentation claims so), but you lose two features in the VRFs for which you’ve configured per-VRF label allocation: EIBGP multipathing (IPv4 lookup in the egress PE-router could lead to forwarding loops) and Carrier’s Carrier functionality (IPv4 lookup in the egress PE-router breaks the end-to-end LSP between CE-routers).
Quote from: http://www.cisco.com/en/US/docs/ios_xr_sw/iosxr_r3.6/routing/configuration/guide/rc36bgp.html#wpmkr1456095
label-allocation-mode per-ce
Configures the per-CE label allocation mode to avoid an extra lookup on the PE router and conserve label space (per-prefix is the default label allocation mode). In this mode, the PE router allocates one label for every immediate next-hop (in most cases, this would be a CE router). This label is directly mapped to the next hop, so there is no VRF route lookup performed during data forwarding. However, the number of labels allocated would be one for each CE rather than one for each VRF. Because BGP knows all the next hops, it assigns a label for each next hop (not for each PE-CE interface). When the outgoing interface is a multiaccess interface and the media access control (MAC) address of the neighbor is not known, Address Resolution Protocol (ARP) is triggered during packet forwarding.
mpls label mode all-vrfs protocol bgp-vpnv4 per-vrf
On 12.2SR code the command is:
mpls label mode all-vrfs protocol all-afs per-vrf
http://www.cisco.com/en/US/docs/routers/asr1000/release/notes/asr1k_feats_important_notes_310s.html#wp3378629
then {
label-allocation per-nexthop;
community add vpn1;
accept;
}
I don't know why this wasn't the default/standard from all vendors right from the start.