Dear Vendors, EVPN Route Attributes Matter

Another scary tale from the Archives of Sloppy Code: we can’t decide whether some attributes are mandatory or optional.

When I was fixing the errors in netlab SR-OS configuration templates, I couldn’t get the EBGP-based EVPN with overlapping leaf AS numbers to work. I could see the EVPN routes in the SR-OS BGP table, but the device refused to use them. I concluded (incorrectly) that there must be a quirk in the SR-OS EVPN code and moved on.

However, when I ran the full integration tests for all platforms impacted by release 25.06, ArubaCX and Dell OS10 failed the same tests, even though they passed them before. I learned (the hard way) to be very careful with my “let’s see what has changed” approach to troubleshooting[1] and eventually figured out that the only thing that changed was the FRRouting software release I used for the other devices in the integration test.

Before diving into the details, here’s a quick overview of the lab topology:

  • VXLAN with EVPN control plane is used to extend a single VLAN between two leaf switches (L1, L2).
  • The leaf switches are connected to a single spine switch (L1 → spine → L2).
  • The three switches run EVPN over directly-connected EBGP sessions (the sane EBGP setup).
  • The leaf switches have the same AS number and have to use something like allowas-in to disable the AS-path checks.

Let’s skip the hero’s journey troubleshooting details and go straight to the results. Recreating the same lab with three FRR nodes (running release 10.3), let’s inspect the two EVPN type-3 routes[2] generated by L1 and L2 on the spine switch:

EVPN type-3 routes on the spine switch
spine# show bgp l2vpn evpn route detail type multicast
Route Distinguisher: 10.0.0.2:1000
BGP routing table entry for 10.0.0.2:1000:[3]:[0]:[32]:[10.0.0.2]
Paths: (1 available, best #1)
  Advertised to non peer-group peers:
  l1(10.1.0.1) l2(10.1.0.5)
  Route [3]:[0]:[32]:[10.0.0.2]
  65200
    10.0.0.2(l1) from l1(10.1.0.1) (10.0.0.2)
      Origin IGP, valid, external, bestpath-from-AS 65200, best (First path received)
      Extended Community: RT:65000:1000 ET:8
      Last update: Fri Jun 13 15:05:59 2025
      PMSI Tunnel Type: Ingress Replication, label: 1000
Route Distinguisher: 10.0.0.3:1000
BGP routing table entry for 10.0.0.3:1000:[3]:[0]:[32]:[10.0.0.3]
Paths: (1 available, best #1)
  Advertised to non peer-group peers:
  l1(10.1.0.1) l2(10.1.0.5)
  Route [3]:[0]:[32]:[10.0.0.3]
  65200
    10.0.0.3(l2) from l2(10.1.0.5) (10.0.0.3)
      Origin IGP, valid, external, bestpath-from-AS 65200, best (First path received)
      Extended Community: RT:65000:1000 ET:8
      Last update: Fri Jun 13 15:05:59 2025
      PMSI Tunnel Type: Ingress Replication, label: 1000

The routes seem legit. They are also almost identical, but that’s to be expected, right? Now let’s look at the same routes on L1. One of the routes is generated by L1, the other is advertised by L2, and propagated by Spine over EBGP sessions. Can you spot the difference (ignoring RD, AS-path, and the like)?

EVPN type-3 routes on L1
l1# show bgp l2vpn evpn route detail type multicast
Route Distinguisher: 10.0.0.2:1000
BGP routing table entry for 10.0.0.2:1000:[3]:[0]:[32]:[10.0.0.2]
Paths: (1 available, best #1)
  Advertised to non peer-group peers:
  spine(10.1.0.2)
  Route [3]:[0]:[32]:[10.0.0.2] VNI 1000
  Local
    10.0.0.2(l1) from 0.0.0.0 (10.0.0.2)
      Origin IGP, weight 32768, valid, sourced, local, bestpath-from-AS Local, best (First path received)
      Extended Community: ET:8 RT:65000:1000
      Last update: Fri Jun 13 15:05:56 2025
      PMSI Tunnel Type: Ingress Replication, label: 1000
Route Distinguisher: 10.0.0.3:1000
BGP routing table entry for 10.0.0.3:1000:[3]:[0]:[32]:[10.0.0.3]
Paths: (1 available, best #1)
  Advertised to non peer-group peers:
  spine(10.1.0.2)
  Route [3]:[0]:[32]:[10.0.0.3]
  65100 65200
    10.0.0.3(spine) from spine(10.1.0.2) (10.0.0.1)
      Origin IGP, valid, external, bestpath-from-AS 65100, best (First path received)
      Extended Community: RT:65000:1000
      Last update: Fri Jun 13 15:06:01 2025
      PMSI Tunnel Type: Ingress Replication, label: 1000

Did you notice this tiny detail?

Local route
      Extended Community: ET:8 RT:65000:1000
Route propagated over two EBGP sessions
      Extended Community: RT:65000:1000

The ET attribute is missing from the remote EVPN route.
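Spotting a missing community by eye does not scale beyond two routes, so here is a minimal sketch of how the comparison could be automated. It assumes the text layout of the `show bgp l2vpn evpn route detail` printouts shown above; the function names and the trimmed sample strings are mine, not part of FRR or netlab.

```python
import re

def extended_communities(show_output: str) -> dict:
    """Map each EVPN prefix to the set of extended communities on its path.

    Assumes the 'BGP routing table entry for ...' and 'Extended Community:'
    line layout seen in the printouts above.
    """
    result = {}
    prefix = None
    for raw in show_output.splitlines():
        line = raw.strip()
        entry = re.match(r"BGP routing table entry for (\S+)", line)
        if entry:
            prefix = entry.group(1)
            result.setdefault(prefix, set())
        elif line.startswith("Extended Community:") and prefix is not None:
            # Everything after the first colon is a space-separated list
            result[prefix].update(line.split(":", 1)[1].split())
    return result

def missing_communities(upstream: str, downstream: str) -> dict:
    """Return communities present upstream but absent downstream, per prefix."""
    up = extended_communities(upstream)
    down = extended_communities(downstream)
    return {
        prefix: up[prefix] - down[prefix]
        for prefix in up
        if prefix in down and up[prefix] - down[prefix]
    }

# Trimmed excerpts of the spine and L1 printouts (only the relevant lines)
SPINE_OUTPUT = """
BGP routing table entry for 10.0.0.2:1000:[3]:[0]:[32]:[10.0.0.2]
      Extended Community: RT:65000:1000 ET:8
BGP routing table entry for 10.0.0.3:1000:[3]:[0]:[32]:[10.0.0.3]
      Extended Community: RT:65000:1000 ET:8
"""

L1_OUTPUT = """
BGP routing table entry for 10.0.0.2:1000:[3]:[0]:[32]:[10.0.0.2]
      Extended Community: ET:8 RT:65000:1000
BGP routing table entry for 10.0.0.3:1000:[3]:[0]:[32]:[10.0.0.3]
      Extended Community: RT:65000:1000
"""

for prefix, lost in missing_communities(SPINE_OUTPUT, L1_OUTPUT).items():
    print(prefix, "lost", sorted(lost))
```

With these two excerpts, the only route flagged is the one originated by L2, and the only community it lost between the spine and L1 is `ET:8`.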

Next step: trying to figure out what the poet had in mind when he wrote ET (probably not this one). Digging through BGP Extended Communities described in RFC 8365, I finally discovered the BGP Encapsulation Extended Community with the BGP Tunnel Encapsulation Attribute Tunnel Type where value 8 means VXLAN.[3]
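To make that community a bit less abstract, here is a minimal sketch of how it looks on the wire. The layout (type 0x03, sub-type 0x0c, four reserved bytes, two-byte tunnel type) and the tunnel type values are my reading of the RFCs rather than something taken from the FRR source code, and the function names are made up for this example.

```python
import struct
from typing import Optional

# Tunnel types listed for EVPN overlays in RFC 8365
TUNNEL_TYPES = {
    8: "VXLAN",
    9: "NVGRE",
    10: "MPLS",
    11: "MPLS in GRE",
    12: "VXLAN GPE",
}

# Transitive opaque extended community, Encapsulation sub-type
EC_TYPE_OPAQUE = 0x03
EC_SUBTYPE_ENCAP = 0x0C

def encode_encapsulation(tunnel_type: int) -> bytes:
    """Build the 8-byte community: type, sub-type, 4 reserved bytes, tunnel type."""
    return struct.pack("!BB4xH", EC_TYPE_OPAQUE, EC_SUBTYPE_ENCAP, tunnel_type)

def decode_encapsulation(community: bytes) -> Optional[str]:
    """Return the tunnel type name, or None if this is some other community."""
    if len(community) != 8:
        raise ValueError("an extended community is exactly 8 bytes long")
    ec_type, ec_subtype, tunnel_type = struct.unpack("!BB4xH", community)
    if (ec_type, ec_subtype) != (EC_TYPE_OPAQUE, EC_SUBTYPE_ENCAP):
        return None
    return TUNNEL_TYPES.get(tunnel_type, f"unknown ({tunnel_type})")

vxlan = encode_encapsulation(8)
print(vxlan.hex(), decode_encapsulation(vxlan))
```

In other words, what FRR prints as `ET:8` is eight bytes on the wire, and the only interesting part is the last two.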

To recap

  • FRR release 10.3 attaches BGP Encapsulation = VXLAN extended community to EVPN routes.
  • FRR release 10.2 propagates that community to other EBGP neighbors while FRR release 10.3 drops it. There is no good reason to do that, as the BGP next hop is unchanged. The device dropping the extended community is not in the VXLAN data path (it’s just forwarding IP) and thus has no business deciding what encapsulation to use.
  • ArubaCX, Dell OS10, and Nokia SR-OS, when configured to use VXLAN encapsulation with EVPN control plane, refuse to use EVPN routes without the BGP Encapsulation community.
  • It looks like all other EVPN implementations for which we implemented allowas-in functionality (Cumulus, EOS, Nexus OS, SR Linux, VyOS) ignore the lack of that community and happily use the routes without it.

OK, one set of vendors must be wrong, right? Welcome to the wonderfully vague world of EVPN. Here are a few choice morsels from that wonderful RFC:

Section 5.1.3: If the BGP Encapsulation Extended Community is not present, then either MPLS encapsulation or a statically configured encapsulation is assumed.

I’m reading the above as “the community is optional” but also “all bets are off.”

Section 6: An ingress NVE can send a frame to an egress NVE only if the set of encapsulations advertised by the egress NVE forms a non-empty intersection with the set of encapsulations supported by the ingress NVE.

Now it seems like the BGP Encapsulation community is mandatory. However, the same paragraph contains this gem:

If the BGP Encapsulation extended community is not present, then the default MPLS encapsulation or a locally configured encapsulation is assumed

Finally, the wonderfully vague cherry on the cake that concludes Section 6:

It is the responsibility of the operator of a given EVI to ensure that all of the NVEs in that EVI support at least one common encapsulation. If this condition is violated, it could result in service disruption or failure. The use of the BGP Encapsulation Extended Community provides a method to detect when this condition is violated, but the actions to be taken are at the discretion of the operator and are outside the scope of this document.
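The quotes above leave enough wiggle room for both observed behaviors. The following sketch is a toy model of the Section 6 intersection rule, not the logic of any vendor’s implementation (I have no idea what their code does); the function name and the `assumed_when_missing` parameter are hypothetical. The only thing that changes between the two outcomes is what the receiver assumes when the community is missing.

```python
VXLAN, MPLS = 8, 10  # tunnel type values from RFC 8365

def can_use_route(route_encaps, supported, assumed_when_missing):
    """Toy model of the Section 6 rule.

    route_encaps: tunnel types carried in BGP Encapsulation communities
    supported: tunnel types the receiving (ingress) NVE supports
    assumed_when_missing: what the receiver assumes when the route carries
        no BGP Encapsulation community (the RFC allows "the default MPLS
        encapsulation or a locally configured encapsulation")
    """
    advertised = set(route_encaps) or {assumed_when_missing}
    # Usable only if the two sets have a non-empty intersection
    return bool(advertised & set(supported))

# Route as received on L1 from FRR 10.3 spine: the community is gone
stripped_route = []

# Receiver assuming its locally configured encapsulation (VXLAN)
print(can_use_route(stripped_route, {VXLAN}, assumed_when_missing=VXLAN))

# Receiver assuming the default MPLS encapsulation
print(can_use_route(stripped_route, {VXLAN}, assumed_when_missing=MPLS))

# Route with the community intact works either way
print(can_use_route([VXLAN], {VXLAN}, assumed_when_missing=MPLS))
```

The first receiver accepts the stripped route and the second one rejects it, and both can point to the same sentence in the RFC to justify what they’re doing.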

The more I’m forced to dig into the details of EVPN, the more they prove my gut feeling that we’re dealing with the SIP of networking. Do I need to keep elaborating on why I would never recommend building a production EVPN network with devices from more than one vendor?


  1. This time, the answer was “nothing” – the device templates and the integration tests hadn’t changed in months.

  2. The routes used to build the list of remote VTEPs for ingress packet replication.

  3. When squinting just right, I could even see a vague resemblance between the name of a value in that BGP community and the acronym used in FRRouting.
