Category: BGP
Can We Trust BGP Next Hops (Part 2)?
Two weeks ago I started with a seemingly simple question:
If a BGP speaker R is advertising a prefix A with next hop N, how does the network know that N is actually alive and can be used to reach A?
… and answered it for the case of directly-connected BGP neighbors (TL&DR: Hope for the best).
Jeff Tantsura provided an EVPN perspective, starting with “the common non-arguable logic is reachability != functionality”.
Now let’s see what happens when we add route reflectors to the mix. Here’s a simple scenario:
Response: Next-Hop and VTEP Reachability in EVPN Networks
Jeff Tantsura published a great response to my Can We Trust BGP Next Hops blog post on LinkedIn, and I asked him for permission to save it in a more permanent form. Here it is (slightly edited)…
I’d like to bring back EVPN context. The discussion is more nuanced, the common non-arguable logic here - reachability != functionality.
Building BGP Route Reflector Configuration with Ansible/Jinja2
One of our subscribers sent me this email when trying to use ideas from the Ansible for Networking Engineers webinar to build BGP route reflector configuration:
I’m currently discovering Ansible/Jinja2 and trying to create BGP route reflector configuration from Jinja2 template using Ansible playbook. As part of group_vars YAML file, I wish to list all route reflector clients IP address. When I have 50+ neighbors, the YAML file gets quite unreadable and it’s hard to see data model anymore.
Whenever you hit a roadblock like this one, you should start with the bigger picture and maybe redefine the problem.
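To make the data-model concern a bit more concrete, here is a minimal sketch of one possible approach: keep only the client addresses in the data model and let a rendering loop generate the repetitive neighbor statements. It uses plain Python string formatting as a stand-in for a Jinja2 template. The AS number, the addresses, and the `rr_data` and `render_rr_config` names are made up for illustration, and the output uses Cisco IOS-like syntax.

```python
from ipaddress import ip_address

# Hypothetical compact data model, similar to what could live in group_vars:
# only the per-client information (the IP address) is listed.
rr_data = {
    "bgp_as": 65000,
    "rr_clients": ["10.0.0.3", "10.0.0.1", "10.0.0.2"],
}


def render_rr_config(data):
    """Expand the compact data model into Cisco IOS-like neighbor statements."""
    lines = [f"router bgp {data['bgp_as']}"]
    # Sort numerically by address so the generated configuration is stable.
    for client in sorted(data["rr_clients"], key=ip_address):
        lines.append(f" neighbor {client} remote-as {data['bgp_as']}")
        lines.append(f" neighbor {client} route-reflector-client")
    return "\n".join(lines)


config = render_rr_config(rr_data)
print(config)
```

The same loop translates directly into a Jinja2 `for` block. The point is only that the data model stays a short list of addresses no matter how many lines of configuration each client needs.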
Can We Trust BGP Next Hops (Part 1)?
Aldrin sent me an interesting question as a comment to one of my EVPN blog posts:
How does the network know that a VTEP is actually alive? (1) from the point of view of the control plane and (2) from the point of view of the data plane? And how do you ensure that control and data plane liveness monitoring has the same view? BFD for BGP is a possible solution for (1) but it’s not meant for 3rd party next hops, i.e. it doesn’t address (2).
Let’s stop right there (or you’ll stop reading in the next 10 milliseconds). I will also try to rephrase the question in more generic terms, hoping Aldrin won’t mind a slight detour… we’ll get back to the original question in another blog post.
MUST READ: Using BGP RPKI for a Safer Internet
As I explained in the How Networks Really Work and Upcoming Internet Challenges webinars, routing security, and BGP security in particular, remains one of the unsolved challenges we’ve been facing for decades (see also: what makes BGP a hot mess).
Fortunately, due to the enormous efforts of a few persistent individuals, BGP RPKI is getting traction (NTT just went all-in), and Flavio Luciani and Tiziano Tofoni decided to do their part by creating an excellent in-depth document describing BGP RPKI theory and configuration on Cisco and Juniper routers.
There are only two things you have to do:
- Read the document;
- Implement RPKI in your network.
Thank you, the Internet will be grateful.
Video: FRRouting Deployment Guidelines
After describing the FRRouting architecture, as well as recent performance optimizations and usability enhancements, Donald Sharp concluded the FRRouting webinar with detailed deployment guidelines.
Video: FRRouting Usability Enhancements
After covering configuration and performance optimizations introduced in recent FRRouting releases, Donald Sharp focused on some of the recent usability enhancements, including BGP BestPath explanations, BGP Hostname, BGP Failed Neighbors, and improved debugging.
Video: FRRouting Configuration and Performance Optimizations
After introducing FRRouting architecture, Donald Sharp dived deep into configuration and performance optimizations, including asynchronous data plane, next-hop groups, and commit-and-rollback.
Podcast: BGP in Public Cloud Revisited
After my response to the BGP is a hot mess topic, Corey Quinn graciously invited me to discuss BGP issues on his podcast. It took us a long while to set it up, but we eventually got there… and the results were published last week. Hope you’ll enjoy our chat.
The EVPN/EBGP Saga Continues
Aldrin wrote a well-thought-out comment to my EVPN Dilemma blog post explaining why he thinks it makes sense to use Juniper’s IBGP (EVPN) over EBGP (underlay) design. The only problem I have is that I forcefully disagree with many of his assumptions.
He started with an in-depth explanation of why EBGP over directly-connected interfaces makes little sense:
Video: FRRouting Architecture
After a brief overview of the FRRouting suite, Donald Sharp continued with a deep dive into FRR architecture, including the various routing daemons, the role of Zebra and ZAPI, the interface between RIB (Zebra) and FIB (Linux kernel), a sample data flow for route installation, and multi-threading in the Zebra and BGP daemons.
Video: FRRouting Overview
In October 2019, Donald Sharp did a short webinar describing FRRouting, the hottest open-source routing suite.
As always, he started with an overview of what FRRouting is, and where you could use it.
EVPN Route Targets, Route Distinguishers, and VXLAN Network IDs
Got this interesting question from one of my readers:
BGP EVPN message carries both VNI and RT. In importing the route, is it enough either to have VNI ID or RT to import to the respective VRF? When importing routes in a VRF, which is considered first, RT or the VNI ID?
A bit of terminology first (which you’d be very familiar with if you ever had to study how MPLS/VPN works):
BGP- and Car Safety
The Facts and Fiction: BGP Is a Hot Mess blog post generated tons of responses, including a thoughtful tweet from Laura Alonso:
Is your argument that the technology works as designed and any issues with it are a people problem?
A polite question like that deserves more than a 280-character reply, but I tried to do my best:
BGP definitely works even better than designed. Is that good enough? Probably, and we could politely argue about that… but the root cause of most of the problems we see today (and people love to yammer about) is not the protocol or how it was designed but how sloppily it’s used.
Laura somewhat disagreed with my way of handling the issue:
Tuning BGP Convergence in High-Availability Firewall Cluster Design
Two weeks ago Nicola Modena explained how to design BGP routing to implement a resilient high-availability network services architecture. The next step to tackle was obvious: how do you fine-tune convergence times, and how does BGP convergence compare to the more traditional FHRP-based design?