Thursday, April 4, 2024 08:22 CEST

BGP Challenge: Build BGP-Free MPLS Core Network

Here’s another challenge for BGP aficionados: build an MPLS-based transit network without BGP running on core routers.

That should be an easy task if you configured MPLS in the past, so try to spice it up a bit:

Use SR/MPLS instead of LDP
Do it on a platform you’re not familiar with (hint: Arista vEOS is a bit different from Cisco IOS)
Try to get it running on FRR containers.

Explore the lab exercise

BGP
netlab

Latest blog posts in this series

BGP Challenge: Build BGP-Free MPLS Core Network (this post)
BGP Labs: Advertise the Default Route
BGP Labs: Stop the Fat-Finger Incidents
BGP Labs: Limit the Number of Accepted BGP Prefixes
BGP Labs: Policy Templates
BGP Labs: Remove Private AS from AS-Path
BGP Labs: Session Templates
BGP Labs: Use Multiple AS Numbers on the Same Router
BGP Labs: Override Neighbor AS Number in AS Path
BGP Labs: Work with FRR and Cumulus Linux

1 comments:

Henk 04 April 2024 05:49

I think you got it wrong, Ivan.

The goal is not to build anything BGP-free. That is heresy!
The goal is that BGP replaces everything else. Just the opposite of what you do here.
In the end, only BGP will survive!!

All hail BGP.

Replies

Ivan Pepelnjak 04 April 2024 05:53

The hardware vendors must love you ;) Imagine having a million routes on every high-speed switch ;))

Daryll Swer 05 April 2024 06:31

eBGP-driven networks (MPLS included) carry only a few routes (that is, only the necessary ones) in each device all the way down, even in large-scale networks. The routed blocks are aggregated at each level of the hierarchy from the edge all the way down, with a blackhole route for the respective aggregate on each device going downwards from the edge. It's easier with IPv6-native networks, though. Default routes are used for egress back up.

Let's say I have a /20 IPv4 range and a /32 IPv6 range as blackholed aggregates on my edge router. From there, I route a /24 and a /48 downwards to my core router, where that /24 and /48 are blackhole-aggregated again. From the layer 3 distribution (downstream of my core), I want to route one /25 to rack05 and one /25 to rack06, and do the same with two /49s. These more specific aggregates are blackholed on my layer 3 distribution before finally being routed to where they are needed (also over eBGP, and also blackholed on the “destination” node).

So the edge has full tables, static blackholes for the /20 and /32, and one /24 and one /48 from the internal AS; that's only two BGP routes from the internal AS. My core has only two BGP routes learnt from its downstream L3 distribution peer and only one BGP route from the edge, which is a default route for egress back up. My L3 distribution has only four BGP routes learnt from the destination node, plus one default route for egress back up to the core.
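The aggregation hierarchy described above can be sketched with Python's standard `ipaddress` module. This is only an illustration: the comment gives prefix lengths but no addresses, so `198.18.0.0/20` (taken from the benchmarking range) and `2001:db8::/32` (the IPv6 documentation prefix) are assumed placeholders, and `first_subnet` and `build_hierarchy` are hypothetical helper names.

```python
import ipaddress

# Assumed placeholder prefixes; the comment specifies only the prefix lengths.
EDGE_V4 = ipaddress.ip_network("198.18.0.0/20")
EDGE_V6 = ipaddress.ip_network("2001:db8::/32")


def first_subnet(parent, new_prefix):
    """Return the first subnet of length new_prefix inside parent."""
    return next(parent.subnets(new_prefix=new_prefix))


def build_hierarchy(edge, core_len, rack_len):
    """Carve one core block out of the edge aggregate, then split it per rack."""
    core = first_subnet(edge, core_len)
    racks = list(core.subnets(new_prefix=rack_len))
    return {"edge": edge, "core": core, "racks": racks}


v4 = build_hierarchy(EDGE_V4, 24, 25)  # /20 -> /24 -> two /25s
v6 = build_hierarchy(EDGE_V6, 48, 49)  # /32 -> /48 -> two /49s

for tier in (v4, v6):
    # Every more specific block must sit inside the aggregate one level up.
    assert tier["core"].subnet_of(tier["edge"])
    assert all(rack.subnet_of(tier["core"]) for rack in tier["racks"])
    print(tier["edge"], "->", tier["core"], "->",
          ", ".join(str(rack) for rack in tier["racks"]))
```

Each level only needs the aggregate it owns plus the more specific blocks it hands downstream, which is why the per-device route count stays small.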

I don't know why there's a misconception that BGP-driven networks have full table dumps everywhere; this is false. Full tables are limited to the edge routers (border routers, DFZ-facing routers).

Let's use your diagram example from this article itself. Let us assume:

PE1 is in Site01
P (core) is in Site02
PE2 is in Site03

Objective is Pseudowire for E1<>E2

PE1 <> PE2 distance is about 1 Kilometre.

LDP enabled. Single-area OSPF (with BFD probably on directly connected interfaces) across all three devices (PE, P), they learn each other's loopback IPs.

An eBGP session between PE1 (AS4200000000) and PE2 (AS4200000001), sourced from and destined to the loopbacks; use BGP to signal VPLS.
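Both AS numbers in this example fall inside the 32-bit private-use range reserved by RFC 6996, so they are safe to use internally. A minimal check, where `is_private_asn` is a hypothetical helper name:

```python
# Private-use AS number ranges reserved by RFC 6996.
PRIVATE_ASN_RANGES = (
    (64512, 65534),            # 16-bit private-use range
    (4200000000, 4294967294),  # 32-bit private-use range
)


def is_private_asn(asn: int) -> bool:
    """Return True if asn lies in one of the private-use ranges."""
    return any(low <= asn <= high for low, high in PRIVATE_ASN_RANGES)


for asn in (4200000000, 4200000001):
    print(asn, is_private_asn(asn))
```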

Routing Table on PE1 (for PE2, just flip the device names):

Directly connected route for the link between PE1 and P
Loopback IP of P
Loopback IP of PE2

Number of routes: 3

Routing Table on P:

Directly connected route for the link between P and PE1
Directly connected route for the link between P and PE2
Loopback IP of PE1
Loopback IP of PE2

Number of routes: 4

This may interest some folks: I was once sent a config dump of a Tier 1 carrier's PE router and figured out a way to simplify the design further with eBGP. If your customer is paying for DIA/IP Transit, you could remove route reflectors completely by running an eBGP-signalled pseudowire from the PE router sitting at a CO or cell site. The pseudowire rides over your LSP core and terminates on the DFZ-facing edge router, so the customer gets seamless L2 directly to the edge, and you can send them full tables without any complexity.

I once dealt with a DIA/IP Transit port from Vodafone in Spain. They didn't have this design and required us to run one BGP session for the default route and a separate session for full tables over a routed /32 and /128 from their side to a router, which was then peered with a different route reflector on their end. It was a horrible mess, and it took them about a week to get the whole thing working. With my approach, the customer sees one BGP session per address family. It also eliminates MTU mess-ups, assuming that you, the architect, made it company-wide policy that no device in an LSP has an L3 MTU lower than 9k or an L2 MTU lower than 9216.

Another story: a carrier sold us L2VPN transport where the primary path delivered 9k as requested, but then came a fibre cut, and the protection path had a 1400 MTU on L3. You can imagine us scratching our heads for 30 minutes over why our BGP peer over this session wouldn't come up.

Side note: For L2VPN services, it'd be nice if carriers enabled this by default, instead of waiting for customer to request it: https://www.juniper.net/documentation/us/en/software/junos/cli-reference/topics/ref/statement/forwarding-options-l2circuit-control-passthrough.html
