BGP Configuration Made Simple with Cumulus Linux
BGP is without doubt the most scalable routing protocol, which made it a popular choice for large-scale deployments from service provider networks to enterprise WAN/VPN networks and even data centers. Its only significant drawback is the tedious configuration process (which almost reminds me of writing COBOL programs decades ago).
The Cumulus Networks routing team decided to change that and added numerous BGP configuration enhancements to Quagga (now FRR), the routing daemon used by Cumulus Linux. You might want to watch the Data Center Architectures video from Cumulus’ Network Field Day 9 presentation for the introduction to the topic before delving into the details.
Configure BGP neighbor on an interface
Here’s how you’d usually configure a BGP neighbor:
router bgp <as-number> neighbor <address> remote-as <as-number>
This is how you can do it with Quagga in Cumulus Linux:
router bgp <as-number> neighbor <interface> remote-as <as-number>
After configuring an interface neighbor, the router tries to figure out the neighbor’s IP address. That’s easy to do for IPv6 (use link-local addresses), and for interfaces with /30 or /31 IPv4 subnets… but Cumulus Linux allows you to run BGP (IBGP or EBGP) over unnumbered interfaces.
It was always possible to run IBGP over unnumbered interfaces: configure an IGP, run IBGP between loopback interfaces, and let IGP sort out how to exchange traffic between the BGP speakers. But how do you get EBGP to run over unnumbered interfaces without using IGP?
Cumulus routing team used one of the more obscure BGP-related RFCs to get the job done. RFC 5549 documents how you exchange IPv4 BGP prefixes with IPv6 next hops (for more details see the RIPE presentation by AMS-IX), and when you combine that with the ability to run EBGP across link-local IPv6 addresses, you get a one-line BGP neighbor configuration that is totally independent from the IPv4/IPv6 addressing in your network and thus easy to turn into a template and automate.
Final question: how do you get the LLA of your neighbor? Easy: listen to its ND or RA messages.
Notes and caveats:
- You can use the interface-based BGP neighbors with devices from other vendors if you use /30 or /31 subnet mask on the interface (in that case Cumulus Linux uses standard BGP);
- You don’t have to run IPv6 in your network to get this feature to work on unnumbered interfaces – just make sure you haven’t disabled IPv6 on the interfaces you want to use in BGP configuration;
- You can use this feature for IPv6 BGP sessions with any third-party box that supports BGP over LLA (Cisco’s IPv6/security guru Eric Vyncke published an RFC describing this idea… but unfortunately that had no impact on messy configuration you have to use in Cisco IOS to get it to work);
- If you want to use interface-based BGP neighbors over unnumbered IPv4 interfaces, the BGP neighbor has to support RFC 5549, and I’m not aware of any major vendor doing that;
- Obviously this feature works only over point-to-point interface. You’d need something like dynamic BGP neighbors from Cisco IOS for multi-access (example: mGRE) scenarios.
Getting rid of AS numbers
The other major pain of BGP configuration syntax (particularly when you’re trying to generate standard configuration templates) is the requirement to specify neighbor AS number in the neighbor statement. Here’s how Cumulus routing team solved that problem:
router bgp <as-number> neighbor <interface> remote-as internal neighbor <interface> remote-as external
The remote-as internal part of the neighbor statement is obvious – use my own BGP AS number. One has to wonder why nobody else is using this syntax; maybe they’re too busy copying industry-standard CLI.
The remote-as external feature is where Cumulus engineers got slightly creative. BGP speakers advertise their AS number in the BGP OPEN message, and the usual behavior is to close the session if the AS number in the incoming OPEN message doesn’t match the number configured on the BGP neighbor statement. Instead of that, the new code they wrote for Quagga ignores the strict AS check, accepts the AS number advertised by the neighbor, and uses it later on in the same way as if it would have been configured in the router configuration.
Regardless of what I think about the whole concept of whitebox switching, I love creative solutions;) It’s refreshing to see how startups with no legacy codebase to protect solve annoyances that have been bothering us for decades. Great job!
Disclosure: Cumulus Networks was indirectly covering some of the costs of my attendance at the Network Field Day 9 event. More…
What book(s) or other resource would you recommend to help learn BGP?
Not sure what to recommend these days, but then the basics of BGP haven't changed that much.
What if the other side has the same "external" configuration with no AS configured?
Secondly, does this mean that quagga will only listen to open messages but never send them? Because what would happen if it sent an open message w/o ASN?
Any comments on the number of ebgp routes a TOR white box should hold considering a medium-large data centre? Agree with vary will design..
Best case: one subnet per ToR switch.
Worst case: one host route per VM.