BGP Session Security: Be Very Skeptical

A while ago, I explained how the Generalized TTL Security Mechanism (GTSM) could be used to prevent denial-of-service attacks on routers running EBGP. After reading the results published in the Analyzing the Security of BGP Message Parsing presentation from DEFCON 31, I started wondering how well GTSM implementations work.

TL&DR summary:

  • The authors of the DEFCON 31 presentation fuzzed BGP OPEN messages.
  • Most BGP routers (apart from Cisco IOS) accepted incoming TCP sessions on port 179 from IP addresses that were not configured as BGP neighbors.
  • Some BGP implementations went as far as processing BGP OPEN messages before saying “go away, I don’t know you.” I would say that’s equivalent to picking up a USB stick in the parking lot and checking its contents.

Considering the above results, can we trust that the vendors do the right thing and drop TCP packets with destination port 179 and too-low TTL before they reach the control plane or (worst case) the BGP daemon?

Checking TTL after the TCP session has been established is useless from the DoS prevention perspective. The control-plane CPU cycles have already been wasted.
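To make the "too-low TTL" check concrete, here is a minimal Python sketch of the decision a pre-control-plane filter would have to make. The names (`Packet`, `gtsm_min_ttl`, `drop_before_control_plane`) are made up for illustration, and the hop-count convention is an assumption: a directly connected peer counts as one hop and must arrive with TTL 255. Some implementations count hops differently and would accept one TTL value lower.

```python
from dataclasses import dataclass

MAX_TTL = 255   # GTSM senders use the maximum TTL (RFC 5082)
BGP_PORT = 179
TCP = 6         # IP protocol number for TCP


def gtsm_min_ttl(hops: int = 1) -> int:
    """Lowest acceptable TTL for a peer at most `hops` hops away.

    Assumption: hops=1 means "directly connected" and maps to TTL 255.
    Implementations that count hops differently accept 255 - hops instead.
    """
    if not 1 <= hops <= MAX_TTL - 1:
        raise ValueError("hops must be between 1 and 254")
    return MAX_TTL - hops + 1


@dataclass(frozen=True)
class Packet:
    """Hypothetical, stripped-down view of an incoming packet."""
    protocol: int
    dst_port: int
    ttl: int


def drop_before_control_plane(pkt: Packet, hops: int = 1) -> bool:
    """True when the packet targets BGP and its TTL is below the GTSM minimum."""
    return (
        pkt.protocol == TCP
        and pkt.dst_port == BGP_PORT
        and pkt.ttl < gtsm_min_ttl(hops)
    )
```

The point of the sketch is where the check runs, not how it is coded: it needs only the IP and TCP headers of a single packet (including the initial SYN), so nothing forces an implementation to wait for an established TCP session before applying it.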

Fortunately, it’s pretty easy to check GTSM-related behavior of a particular BGP implementation:

  • Start a lab with two BGP nodes
  • Configure GTSM on one of them
  • Reset the BGP session to make sure GTSM applies to the session setup process (or not)
  • The session should be stuck in the ACTIVE state. Being able to proceed beyond the ACTIVE state indicates that the GTSM implementation is broken (or, to put it politely, suboptimal).
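The last step of that procedure boils down to reading the BGP state on the device without GTSM and deciding what it implies. A rough Python sketch of that interpretation follows. The verdict strings and the function name are invented for this example, and the grouping of states assumes the RFC 4271 state machine, where OpenSent and OpenConfirm can be reached only after the TCP handshake has completed.

```python
# BGP FSM states (RFC 4271) in which no TCP session with the peer exists yet.
PRE_TCP_STATES = {"idle", "connect", "active"}

# States that can be reached only after the TCP three-way handshake completed.
POST_TCP_STATES = {"opensent", "openconfirm"}


def gtsm_verdict(state: str) -> str:
    """Map the BGP state seen on the non-GTSM peer to a (hypothetical) verdict."""
    s = state.strip().lower()
    if s in PRE_TCP_STATES:
        # The TCP SYN with too-low TTL never got an answer.
        return "dropped-early"
    if s in POST_TCP_STATES:
        # The TCP session was established; TTL was checked too late.
        return "checked-after-tcp"
    if s.startswith("estab"):
        # The session came up, so the TTL check was not applied at all.
        return "not-enforced"
    raise ValueError(f"unrecognized BGP state: {state!r}")
```

Feed it the state column from the BGP summary printout (for example `gtsm_verdict("Active")`). Anything other than the first verdict means the control plane did work it should never have been asked to do.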

I started the Protect EBGP Sessions lab exercise for a quick check of FRR behavior. The lab exercise pre-configures GTSM on Cumulus Linux [1] and a standard EBGP neighbor on the user device (Arista cEOS in my case). This is what I got on Arista cEOS:

rtr#sh ip bgp sum
BGP summary information for VRF default
Router identifier 10.0.0.1, local AS number 65000
Neighbor Status Codes: m - Under maintenance
  Description              Neighbor V AS           MsgRcvd   MsgSent  InQ OutQ  Up/Down State   PfxRcd PfxAcc
  x1                       10.1.0.2 4 65100             15        22    0   95 00:00:36 OpenConfirm
  x2                       10.1.0.6 4 65101             10         8    0    0 00:00:34 Connect

The BGP session with x1 (the FRR node) was in the OpenConfirm state, which means that:

  • The TCP SYN packet was accepted by FRR even though its TTL was incorrect.
  • FRR completed the TCP session establishment process.
  • The BGP OPEN message with incorrect TTL was obviously dropped (the session was stuck in the OpenConfirm phase), but the TCP session was not torn down.

On the other side, FRR reported the BGP session being stuck in the OPENSENT state:

x1# sh ip bgp sum

IPv4 Unicast Summary:
BGP router identifier 10.0.0.10, local AS number 65100 vrf-id 0
BGP table version 2
RIB entries 3, using 600 bytes of memory
Peers 1, using 23 KiB of memory

Neighbor        V         AS   MsgRcvd   MsgSent   TblVer  InQ OutQ  Up/Down State/PfxRcd   PfxSnt
10.1.0.1        4      65000         3        11        0    0    0 00:00:40     OpenSent        0

Total number of neighbors 1

FRR obviously:

  • Completed TCP session setup even though the incoming TCP SYN packet had incorrect (too low) TTL
  • Sent the BGP OPEN message but never processed the answer (thus the OpenSent state)

I’m using Cumulus Linux 4.x for the external BGP speakers in the BGP labs, and it could be that the FRR team improved GTSM behavior in the recent versions of FRR, so I restarted the labs using FRR 9.0.1. I got the exact same behavior.

Other Platforms

This blog post describes a proof-of-concept procedure you can use to test GTSM behavior on platforms you’re interested in. I will not waste my time running those tests, but if you get interesting results please leave a comment.
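If you don't have a second router handy, a crude probe can tell you whether a device completes the TCP handshake on port 179 for packets sent with a given TTL. The following Python sketch uses only the standard `socket` module; the function name and defaults are mine, it is IPv4-only, and it cannot tell a silently dropped SYN from a refused connection (both return `False`). Keep in mind that every router between the probe and the target decrements the TTL, and that a device might reject you because you're not a configured neighbor rather than because of the TTL.

```python
import socket


def tcp_handshake_completes(host: str, ttl: int, port: int = 179,
                            timeout: float = 3.0) -> bool:
    """Try a TCP connect with the given IP TTL; True if the handshake completes.

    Illustrative probe only: a timeout (SYN dropped) and a refusal (RST)
    are both reported as False.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        # Set the TTL of outgoing packets, including the initial SYN.
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_TTL, ttl)
        sock.settimeout(timeout)
        try:
            sock.connect((host, port))
        except OSError:
            # Covers timeouts, refused connections and unreachable hosts.
            return False
        return True
```

In the lab described above, something like `tcp_handshake_completes("10.1.0.2", ttl=64)` run from a directly connected Linux host should return `False` if the GTSM-enabled device drops low-TTL SYN packets early. A `True` result would match the behavior I observed: the TCP session is established first and the TTL is checked later.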

  1. Cumulus Linux uses FRR as its BGP routing daemon.

3 comments:

  1. It may interest readers that the GTSM specification (RFC 5082) prescribes a sending TTL of 255, where one might have expected a TTL equal to the maximum number of acceptable hops.

    This implies a TTL of 253 is considered "too low" for a standard directly connected EBGP peering session.

  2. One might be able to combine an inbound interface ACL with the GTSM idea. If the system allows matching on TTL in combination with IP, protocol (TCP), and port, this could be used to drop packets with a too-low TTL value. GTSM still needs to be enabled so that the BGP speakers use high instead of low TTL values.

    This is not perfect, e.g., it only works on one of the two BGP speakers of a session (the one that answers a SYN sent to TCP port 179), but it could help against some random attacker from across the Internet intending to send a crafted OPEN message for remote code execution.

    Replies
    1. I hoped that GTSM would be implemented in CoPP ACL (the only place where it would make sense) or failing that in iptables in Linux-based devices. Looks like I was way too optimistic (again).

      As for "it works on one of the two BGP speakers", you can drop packets when the source or destination port is 179 and TTL is too low, and allow all other packets with source or destination port 179 (or just let them through).

    2. Thanks. Legitimate BGP packets entering a BGP speaker have either a source or destination port of 179, so this takes care of both directions.

      Back in the day, I had hoped that SSH service ACLs were implemented in some kind of control plane ACL. But then I tested it on different devices: some would look at the source address only after starting SSH session establishment. Thus service ACLs would not protect from vulnerabilities in the SSH session establishment code. I would expect similar problems might occur with any service on a router.

      (I still use service ACLs for hardening, I see it as one element in a defense in depth approach.)

  3. Pardon my ignorance (my knowledge of BGP is very rusty), but this statement:

    "Most BGP routers (apart from Cisco IOS) accepted incoming TCP sessions on port 179 from IP addresses that were not configured as BGP neighbors."

    I'd think it'd be obvious for BGP routers to only accept incoming sessions from configured BGP neighbors, right? Because BGP is the most critical infrastructure, the backbone of the Internet, why would you want your router to accept incoming session from anyone but KNOWN sources? What's the rationale?

    Replies
    1. Most network devices these days run on Linux (or xBSD). The xNIX TCP stack cannot filter incoming sessions based on source IP addresses; you'd have to deploy iptables (or equivalent) filters to get that done.

      It looks like no networking vendor cares enough to get that done (or even better: deploy BGP protection ACLs in hardware), and the end-users are not screaming loud enough to force them to do so.

      Cisco IOS is different because it uses a home-grown TCP stack.
