Running BGP Route Reflector in a Virtual Machine

The BGP-based SDN Solutions webinar triggered another interesting question from one of the attendees:

It seems like the BGP route reflector functionality can be implemented as a Virtual Machine. Will a VM have enough resources to meet the requirements of a RR?

Short answer: Yes.

The only resources a BGP route reflector needs are CPU and memory. The networking requirements aren't that critical - the BGP sessions are more likely to be CPU-bound than bandwidth-bound.

Considering this, it makes little sense to run a BGP route reflector on a router or switch - you're wasting valuable forwarding hardware to provide functionality that could be implemented on any x86 server. IXP operators realized that a long time ago; many IXPs use Bird or Quagga as route servers.
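To illustrate why route reflection is pure control-plane work, here's a minimal Python sketch of the advertisement rules from RFC 4456. It's an illustration under simplifying assumptions, not how Bird, Quagga or any vendor implements it: the Peer and Route classes, the reflect function and all addresses are made up for this example, best-path selection is ignored, and the route is simply never sent back to the peer it came from.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Peer:
    """Hypothetical IBGP peer of the route reflector."""
    name: str
    router_id: str
    is_client: bool


@dataclass
class Route:
    """Hypothetical route carrying only the attributes reflection touches."""
    prefix: str
    originator_id: Optional[str] = None
    cluster_list: List[str] = field(default_factory=list)


def reflect(route: Route, learned_from: Peer, peers: List[Peer],
            cluster_id: str) -> Tuple[List[str], Optional[Route]]:
    """Return (names of peers to advertise to, reflected route).

    Returns ([], None) when the route is dropped by loop prevention.
    """
    # Loop prevention: our own cluster ID is already in CLUSTER_LIST.
    if cluster_id in route.cluster_list:
        return [], None

    reflected = Route(
        prefix=route.prefix,
        # Set ORIGINATOR_ID only if it is not already present.
        originator_id=route.originator_id or learned_from.router_id,
        # Prepend the local cluster ID to CLUSTER_LIST.
        cluster_list=[cluster_id] + route.cluster_list,
    )

    if learned_from.is_client:
        # Route from a client: reflect to clients and non-clients.
        targets = [p for p in peers if p != learned_from]
    else:
        # Route from a non-client: reflect to clients only.
        targets = [p for p in peers if p.is_client and p != learned_from]

    return [p.name for p in targets], reflected


c1 = Peer("c1", "10.0.0.1", True)
c2 = Peer("c2", "10.0.0.2", True)
n1 = Peer("n1", "10.0.0.3", False)
peers = [c1, c2, n1]

names, out = reflect(Route("192.0.2.0/24"), c1, peers, "10.0.0.100")
print(names, out)
```

Nothing in this logic touches the data plane: it's list manipulation over the received routes, which is why CPU and memory are the resources that matter.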

The service providers offering more than just IP connectivity couldn't use the same trick. Most open-source BGP implementations don't support the additional address families needed to implement L2VPNs or L3VPNs. However, it seems the large customers managed to push the traditional networking vendors hard enough - Cisco, Juniper and Alcatel-Lucent are offering BGP route reflector functionality in a VM format - from Cisco's Cloud Services Router (CSR 1000v) and IOS XRv to Juniper's Virtual Route Reflector and ALU's VSR-RR.

Summary: if you want to implement BGP route reflector functionality outside of the regular forwarding path (as a dedicated control-plane function), don't waste money buying a router to do it. Use either an open-source solution or a network operating system in VM format.

More information

BGP-Based SDN Solutions webinar describes numerous BGP-based SDN solutions and open-source tools you could use to build another one.

8 comments:

  1. I think RR as a VM makes a lot of sense. With traditional IP forwarding we had the possibility of route deflection, so the RRs should be placed in the path of the topology.

    At this point in time I would assume that all ISPs use label-switched forwarding for Internet traffic as well, so it should not be an issue even in that case.

    MPLS VPNs don't have this problem of course since the egress exit point is determined at ingress and there is no hop by hop decision on forwarding.

    By running it in a VM you could shift the workload while you are patching one of the RRs. It's trivial to add memory and CPU, which would not be the case for a physical router.

    The drawback seems to be the licensing which is more expensive than one would imagine (isn't that always the case?).
  2. Decoupling the compute-centric function of RR from the expensive forwarding hardware is sensible, but the assumption that it's also then right to go all-in centralised rather than distributed can be more of a per-address family question.

    It works for private VPN address families where NLRIs are relatively unique, but BGP's lack of ability to communicate more than the active path is a hindrance for the Internet table, since any moderately geographically-diverse network with peering ends up requiring quite localised route selection.

    BGP ORR is an interesting option to address this problem, but I can't help thinking that the fixation on centralisation of what is crucially a distributed compute problem is the main issue.
    Replies
    1. BGP Add-path could be a tool to use for this, no?

      Not that I am advocating a centralised RR setup.
    2. Add path could definitely be used. There is an interesting presentation by Mark Tinka here:

      http://www.slideshare.net/mynog/21st-centuryibgp-routereflectionmarktinka
  3. The big issue is still fault domain separation. If the vRR is down, you need one team to check the hypervisor (HDD, network interfaces, etc.) and another team to check/debug the protocol; not sure how many NOC teams have both skill sets at the moment.
  4. ORR and add-path are tools you could use together or independently to create a few clusters of RRs on a larger network to simulate what is fully distributed. ORR is available from both Juniper and Cisco, albeit via the most bleeding edge software and Juniper only supports IS-IS today.

    On my network for instance I have 20+ sites today in a full mesh, and they act as RRs for downstream routers. I could likely replace all of that with 3 pairs of servers running multiple vRR VMs for different domains along with ORR for true best path selection.
  5. 179th BGP tagged post... I chuckled.
  6. This makes total sense. Running BGP RR on x86 servers brings a lot of benefits, including separation of control plane and forwarding plane, cost reduction and flexibility. Obviously, they should still sit in a spot in the network where it makes sense. In addition, this gives agility and fault domain containment to operators, who can now easily deploy multiple vRRs for different address families, such as IPv4, VPNv4, MVPN, IPv6, etc.