BGP Session Security: Be Very Skeptical
A while ago I explained how the Generalized TTL Security Mechanism (GTSM) could be used to prevent denial-of-service attacks on routers running EBGP. Considering the results published in the Analyzing the Security of BGP Message Parsing presentation from DEFCON 31, I started wondering how well GTSM implementations work.
TL&DR summary:
- The authors of the DEFCON 31 presentation fuzzed BGP OPEN messages.
- Most BGP routers (apart from Cisco IOS) accepted incoming TCP sessions on port 179 from IP addresses that were not configured as BGP neighbors.
- Some BGP implementations went as far as processing BGP OPEN messages before saying “go away, I don’t know you.” I would say that’s equivalent to picking up a USB stick in the parking lot and checking its contents.
Considering the above results, can we trust that the vendors do the right thing and drop TCP packets with destination port 179 and too-low TTL before they reach the control plane or (worst case) the BGP daemon?
Fortunately, it’s pretty easy to check GTSM-related behavior of a particular BGP implementation:
- Start a lab with two BGP nodes
- Configure GTSM on one of them
- Reset the BGP session to make sure GTSM applies to the session setup process (or not)
- The session should be stuck in ACTIVE state. Being able to proceed beyond ACTIVE state indicates that the GTSM implementation is suboptimal (not to say broken).
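If you don't have a second BGP speaker handy, the TCP part of the check can be approximated from any Linux host. What follows is a minimal sketch, not a complete test tool: `probe_tcp` is a hypothetical helper name, and the only non-obvious call is the standard `IP_TTL` socket option, which sets the TTL of outgoing packets. If a GTSM-protected router answers a low-TTL SYN with a SYN/ACK, the probe reports `established`; a silent drop shows up as `timeout`.

```python
import socket


def probe_tcp(host: str, port: int = 179, ttl: int = 64, timeout: float = 3.0) -> str:
    """Try to open a TCP session with a chosen IP TTL and report what happened.

    Returns "established", "refused", "timeout", or "error".
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        # Send the SYN (and everything after it) with the requested TTL
        s.setsockopt(socket.IPPROTO_IP, socket.IP_TTL, ttl)
        s.settimeout(timeout)
        try:
            s.connect((host, port))
        except socket.timeout:
            return "timeout"       # no answer: the SYN was probably dropped
        except ConnectionRefusedError:
            return "refused"       # the target answered with a TCP RST
        except OSError:
            return "error"         # unreachable network, ICMP error, ...
        return "established"       # three-way handshake completed
```

Keep in mind that a router might also reset or ignore the session because the probing host is not a configured BGP neighbor, so run the probe from the neighbor's IP address if you want to isolate the GTSM behavior.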
I started the Protect EBGP Sessions lab exercise for a quick check of FRR behavior. The lab exercise pre-configures GTSM on Cumulus Linux[1] and a standard EBGP neighbor on the user device (Arista cEOS in my case). This is what I got on Arista cEOS:
rtr#sh ip bgp sum
BGP summary information for VRF default
Router identifier 10.0.0.1, local AS number 65000
Neighbor Status Codes: m - Under maintenance
Description  Neighbor  V  AS     MsgRcvd  MsgSent  InQ  OutQ  Up/Down   State        PfxRcd  PfxAcc
x1           10.1.0.2  4  65100  15       22       0    95    00:00:36  OpenConfirm
x2           10.1.0.6  4  65101  10       8        0    0     00:00:34  Connect
The BGP session was in CONNECT state, which means that:
- The TCP SYN packet was accepted by FRR even though its TTL was incorrect.
- FRR completed the TCP session establishment process.
- The BGP OPEN message with incorrect TTL was obviously dropped (the session was stuck in the CONNECT phase), but the TCP session was not torn down.
On the other side, FRR reported the BGP session being stuck in the OPENSENT state:
x1# sh ip bgp sum
IPv4 Unicast Summary:
BGP router identifier 10.0.0.10, local AS number 65100 vrf-id 0
BGP table version 2
RIB entries 3, using 600 bytes of memory
Peers 1, using 23 KiB of memory
Neighbor  V  AS     MsgRcvd  MsgSent  TblVer  InQ  OutQ  Up/Down   State/PfxRcd  PfxSnt
10.1.0.1  4  65000  3        11       0       0    0     00:00:40  OpenSent      0
Total number of neighbors 1
FRR obviously:
- Completed TCP session setup even though the incoming TCP SYN packet had incorrect (too low) TTL
- Sent the BGP OPEN message but never processed the answer (thus the OpenSent state)
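The conclusion drawn from the FRR printout rests on the BGP state machine in RFC 4271: a speaker sends its OPEN message and enters OpenSent only after the TCP connection is up. A tiny sketch of that inference (the function name is made up for this illustration):

```python
# BGP FSM states (RFC 4271) that a speaker can reach only after
# the TCP connection to the peer has been established.
STATES_AFTER_TCP_SETUP = {"OpenSent", "OpenConfirm", "Established"}


def tcp_session_was_established(bgp_state: str) -> bool:
    """Return True if the reported BGP state implies a completed TCP handshake."""
    return bgp_state in STATES_AFTER_TCP_SETUP
```

Feeding it the state FRR reported (`OpenSent`) returns `True`, which is why the output above tells us the low-TTL SYN was not filtered.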
I’m using Cumulus Linux 4.x for the external BGP speakers in the BGP labs, and it could be that the FRR team improved GTSM behavior in the recent versions of FRR, so I restarted the labs using FRR 9.0.1. I got the exact same behavior.
Other Platforms
This blog post describes a proof-of-concept procedure you can use to test GTSM behavior on platforms you’re interested in. I will not waste my time running those tests, but if you get interesting results please leave a comment.
More Information
- I got the link to the DEFCON 31 presentation from the lovely SINOG 7.0 The beautiful mess that is BGP presentation by Emile Aben
- Check out the Internet Routing Security webinar if you want to know more about BGP security.
- For an overview of what can go wrong with BGP watch the Internet Routing Security part of Network Security Fallacies section of How Networks Really Work.
- Want to get your hands dirty? Do the Protect EBGP Sessions lab exercise (part of ipSpace.net BGP Configuration Labs).
[1] Cumulus Linux uses FRR as its BGP routing daemon.
It may interest readers that the GTSM specification (RFC 5082) specifies a sending TTL of 255, where one might have expected a TTL equal to the maximum number of hops acceptable.
This implies a TTL of 253 is considered "too low" for a standard directly connected EBGP peering session.
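The reason for sending 255 is that routers can only decrement the TTL, so a packet arriving with a TTL close to 255 cannot have crossed many hops, no matter what initial TTL a remote attacker picks. A minimal sketch of the receive-side check follows; `trust_radius` is an illustrative parameter name for the number of decrements the receiver tolerates, and whether a directly connected session requires exactly 255 or also accepts 254 depends on the implementation.

```python
GTSM_SEND_TTL = 255  # TTL a GTSM-enabled sender puts into its packets


def gtsm_accepts(received_ttl: int, trust_radius: int = 0) -> bool:
    """Accept a packet only if its TTL was decremented at most trust_radius times."""
    return GTSM_SEND_TTL - trust_radius <= received_ttl <= GTSM_SEND_TTL
```

With these assumptions, `gtsm_accepts(253)` and `gtsm_accepts(253, trust_radius=1)` are both `False`, while `gtsm_accepts(254, trust_radius=1)` is `True`.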
One might be able to combine an inbound interface ACL with the GTSM idea. If the system allows matching on TTL in combination with IP, protocol (TCP), and port, this could be used to drop packets with a too-low TTL value. GTSM still needs to be enabled so that the BGP speakers use high instead of low TTL values.
This is not perfect, e.g., it only works on one of the two BGP speakers of a session (the one that answers a SYN sent to TCP port 179), but it could help against some random attacker from across the Internet intending to send a crafted OPEN message for remote code execution.
I hoped that GTSM would be implemented in CoPP ACL (the only place where it would make sense) or failing that in iptables in Linux-based devices. Looks like I was way too optimistic (again).
As for "it works on one of the two BGP speakers", you can drop packets when the source or destination port is 179 and TTL is too low, and allow all other packets with source or destination port 179 (or just let them through).
Thanks. Legitimate BGP packets entering a BGP speaker have either a source or a destination port of 179, so this takes care of both directions.
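The filter logic discussed in this thread can be written down as a small predicate. This is only a model of the ACL, not a configuration for any particular platform; the `Packet` class, the `verdict` function, and the default threshold of 255 are assumptions made for the illustration.

```python
from dataclasses import dataclass

BGP_PORT = 179
IPPROTO_TCP = 6  # IP protocol number for TCP


@dataclass
class Packet:
    protocol: int
    src_port: int
    dst_port: int
    ttl: int


def verdict(pkt: Packet, min_ttl: int = 255) -> str:
    """Drop BGP packets (either direction) whose TTL is below min_ttl."""
    is_bgp = pkt.protocol == IPPROTO_TCP and BGP_PORT in (pkt.src_port, pkt.dst_port)
    if is_bgp and pkt.ttl < min_ttl:
        return "drop"
    return "permit"  # everything else is left to the other ACL entries
```

Matching on either port covers both the session the router initiated (incoming source port 179) and the one it accepted (incoming destination port 179).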
Back in the day, I had hoped that SSH service ACLs were implemented in some kind of control plane ACL. But then I tested it on different devices: some would look at the source address only after starting SSH session establishment. Thus service ACLs would not protect from vulnerabilities in the SSH session establishment code. I would expect similar problems might occur with any service on a router.
(I still use service ACLs for hardening, I see it as one element in a defense in depth approach.)
Pardon my ignorance (my knowledge of BGP is very rusty), but this statement:
"Most BGP routers (apart from Cisco IOS) accepted incoming TCP sessions on port 179 from IP addresses that were not configured as BGP neighbors."
I'd think it'd be obvious for BGP routers to only accept incoming sessions from configured BGP neighbors, right? Because BGP is the most critical infrastructure, the backbone of the Internet, why would you want your router to accept incoming sessions from anyone but KNOWN sources? What's the rationale?
Most network devices these days run on Linux (or xBSD). The xNIX TCP stack cannot filter incoming sessions based on source IP addresses; you'd have to deploy iptables (or equivalent) filters to get that done.
It looks like no networking vendor cares enough to get that done (or even better: deploy BGP protection ACLs in hardware), and the end-users are not screaming loud enough to force them to do so.
Cisco IOS is different because it uses a home-grown TCP stack.
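The point about the xNIX TCP stack can be illustrated with plain sockets. In the sketch below (all function names are made up, and this is not how any particular BGP daemon is written), the application learns the peer's address only from `accept()`, which returns after the kernel has already completed the three-way handshake. Rejecting an unknown peer at that point is too late to keep it away from the listening socket.

```python
import socket


def serve_once(listener: socket.socket, allowed_peers: set) -> str:
    """Accept one session and only then decide whether the peer is known."""
    # accept() returns an already-established TCP session
    conn, (peer_ip, _peer_port) = listener.accept()
    with conn:
        return "accepted" if peer_ip in allowed_peers else "rejected"


def demo_unknown_peer() -> tuple:
    """Connect from a peer that is not in the allowed set (localhost demo)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))   # ephemeral port instead of 179
    listener.listen(1)
    listener.settimeout(5.0)
    port = listener.getsockname()[1]
    try:
        # The handshake succeeds before the application has seen the peer
        client = socket.create_connection(("127.0.0.1", port), timeout=5.0)
        handshake_completed = True
        decision = serve_once(listener, allowed_peers={"192.0.2.1"})
        client.close()
    finally:
        listener.close()
    return handshake_completed, decision
```

Filtering before the handshake requires something in front of the socket, such as the packet filters mentioned above.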