Using EVPN in Very Small Data Center Fabrics

I had an interesting “how do you build a small fabric without throwing every technology into the mix” discussion with Nicola Modena and mentioned that I don’t see a reason to use EVPN in fabrics with just a few switches. He disagreed and gave me a few good scenarios where EVPN might be handy. Before discussing them, let’s establish a baseline.

The Setup

Assume you’re building two small data center fabrics (small because you have only a few hundred VMs and two because of redundancy and IT auditors).

You don’t need more than two switches in each data center, and because the bandwidth requirements are usually reasonably small and long-distance links are expensive, you could connect them like this:

Furthermore (because we’re dealing with an Enterprise design), let’s assume you have to support end-to-end layer-2 connectivity across this fabric.

As you know, I would never recommend that a customer do that, but sometimes you have more important battles to fight in a limited amount of time.

To keep things simple, we’ll assume everyone involved wants to have the simplest possible fabric design, so we won’t be using port channels between servers and switches but will rely on active/standby links or whatever other load distribution mechanism the servers (or hypervisors) provide.

Let’s Do It the Old Way First

In my simplistic design, I’d use VXLAN encapsulation with ingress replication based on static flood lists:

  • I’d use pure IP routing within the fabric;
  • Switches would use VXLAN encapsulation to transport Ethernet traffic across the IP fabric;
  • Ingress replication would be used instead of IP multicast to implement VXLAN flooding of BUM frames;
  • Dynamic MAC learning relying on BUM flooding would be used to populate MAC-to-VTEP tables;
  • Instead of using a control-plane protocol to build the list of remote VTEPs to which the traffic needs to be flooded, I’d use a static list of remote VTEPs configured on every switch.
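
The behavior described in the list above can be illustrated with a toy simulation. This is a minimal Python sketch, not switch code: the `Vtep` class, the `forward` helper, and all addresses are made up for illustration, and forwarding between ports of the same switch is ignored.

```python
from dataclasses import dataclass, field
from typing import Dict, List

BROADCAST = "ff:ff:ff:ff:ff:ff"


@dataclass
class Vtep:
    """Toy model of one switch acting as a VXLAN tunnel endpoint (hypothetical)."""
    ip: str
    flood_list: List[str]                                     # statically configured remote VTEPs
    mac_table: Dict[str, str] = field(default_factory=dict)   # remote MAC -> VTEP IP

    def targets(self, dst_mac: str) -> List[str]:
        """Known unicast goes to one VTEP; everything else is replicated to the flood list."""
        if dst_mac != BROADCAST and dst_mac in self.mac_table:
            return [self.mac_table[dst_mac]]
        return list(self.flood_list)

    def learn(self, src_mac: str, from_vtep: str) -> None:
        """Data-plane learning: remember which VTEP the source MAC sits behind."""
        self.mac_table[src_mac] = from_vtep


def forward(fabric: Dict[str, Vtep], ingress: str, src_mac: str, dst_mac: str) -> List[str]:
    """Send one frame into the fabric and return the VTEPs that received a copy."""
    receivers = fabric[ingress].targets(dst_mac)
    for ip in receivers:
        fabric[ip].learn(src_mac, ingress)
    return receivers


# Four switches, each with a static flood list containing the other three.
ips = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"]
fabric = {ip: Vtep(ip, [other for other in ips if other != ip]) for ip in ips}

arp_request = forward(fabric, "10.0.0.1", "aa:aa:aa:aa:aa:aa", BROADCAST)
arp_reply = forward(fabric, "10.0.0.3", "bb:bb:bb:bb:bb:bb", "aa:aa:aa:aa:aa:aa")
print(arp_request)  # flooded to the three remote VTEPs
print(arp_reply)    # sent only to the VTEP where the first host was learned
```

The broadcast is replicated to every entry in the static flood list, and that flooding is what lets the remote switches learn where the sender lives. The reply can then be sent as a single unicast copy.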

Admittedly, you’d have to tweak the configuration every time you add a switch to my fabric design. However, it would be trivial¹ to write a simple Ansible playbook to generate the switch configuration. If you want to learn how to do that, check out the Ansible for Networking Engineers online course.
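
To give a feel for how little logic such a generator needs, here is a minimal sketch of the data it would have to compute. It is plain Python rather than an Ansible playbook, and the switch names, VTEP addresses, and VNI numbers are all invented. A playbook could feed the same per-switch data into a device-specific template.

```python
from typing import Dict, List

# Hypothetical inventory: switch name -> VTEP (loopback) IP address.
VTEPS = {
    "dc1-sw1": "10.0.0.1",
    "dc1-sw2": "10.0.0.2",
    "dc2-sw1": "10.0.0.3",
    "dc2-sw2": "10.0.0.4",
}
# Hypothetical list of VXLAN segments stretched across the fabric.
VNIS = [10100, 10200]


def flood_list_vars(vteps: Dict[str, str], vnis: List[int]) -> Dict[str, Dict[int, List[str]]]:
    """For every switch, map each VNI to the VTEP IPs of all other switches."""
    result = {}
    for name in vteps:
        remotes = [ip for other, ip in vteps.items() if other != name]
        result[name] = {vni: list(remotes) for vni in vnis}
    return result


host_vars = flood_list_vars(VTEPS, VNIS)
for vni, remotes in host_vars["dc1-sw1"].items():
    print(vni, remotes)
```

Adding a switch then means adding one line to the inventory and regenerating the configuration for every switch.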

Now Add EVPN to the Mix

Instead of manually configuring VTEP flood lists, you could use an EVPN control plane between the four switches. I would use a full mesh of IBGP sessions to keep the design as simple as possible (remembering I’d have to go for route reflectors if the fabric grows).
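
The full-mesh versus route-reflector tradeoff is simple arithmetic. The sketch below assumes that two of the existing switches would double as route reflectors and that every other switch peers with both of them; the function names are made up for illustration.

```python
from itertools import combinations
from typing import List, Tuple


def full_mesh_sessions(switches: List[str]) -> List[Tuple[str, str]]:
    """Every pair of switches needs one IBGP session."""
    return list(combinations(switches, 2))


def full_mesh_count(n: int) -> int:
    """Number of sessions in a full mesh of n switches: n * (n - 1) / 2."""
    return n * (n - 1) // 2


def route_reflector_count(n: int, reflectors: int = 2) -> int:
    """Sessions when `reflectors` of the n switches act as route reflectors.

    Each client peers with every reflector, and the reflectors peer with each other.
    """
    clients = n - reflectors
    return clients * reflectors + reflectors * (reflectors - 1) // 2


fabric = ["dc1-sw1", "dc1-sw2", "dc2-sw1", "dc2-sw2"]
print(len(full_mesh_sessions(fabric)), route_reflector_count(len(fabric)))  # 6 versus 5
print(full_mesh_count(20), route_reflector_count(20))                       # 190 versus 37
```

With four switches, route reflectors save a single session, which is why the full mesh is the simpler choice here. The gap only becomes meaningful as the fabric grows.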

EVPN would automatically build the flood lists (removing the need for manual configuration) and propagate customer MAC addresses using BGP.
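
Conceptually, the automatic flood list is just a filter over what the other switches advertise. The following sketch is a strong simplification: real EVPN carries this information in Inclusive Multicast Ethernet Tag (type 3) routes and uses route targets to decide what to import, while the toy model below only tracks “VTEP X participates in VNI Y”. All names and addresses are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# Hypothetical fabric: VTEP IP -> VNIs configured on that switch.
LOCAL_VNIS: Dict[str, Set[int]] = {
    "10.0.0.1": {100, 200},
    "10.0.0.2": {100},
    "10.0.0.3": {100, 200},
    "10.0.0.4": {200},
}


def advertise(local_vnis: Dict[str, Set[int]]) -> List[Tuple[str, int]]:
    """Each VTEP announces one (vtep, vni) route per locally configured VNI."""
    return [(vtep, vni) for vtep, vnis in local_vnis.items() for vni in sorted(vnis)]


def flood_lists_for(me: str, local_vnis: Dict[str, Set[int]],
                    routes: List[Tuple[str, int]]) -> Dict[int, List[str]]:
    """Build per-VNI flood lists from routes originated by other VTEPs."""
    result: Dict[int, List[str]] = defaultdict(list)
    for origin, vni in routes:
        # Ignore our own routes and VNIs we do not have configured locally.
        if origin != me and vni in local_vnis[me]:
            result[vni].append(origin)
    return dict(result)


routes = advertise(LOCAL_VNIS)
print(flood_lists_for("10.0.0.1", LOCAL_VNIS, routes))
print(flood_lists_for("10.0.0.2", LOCAL_VNIS, routes))
```

Note that the resulting flood lists are per VNI and contain only the switches that actually participate in that VNI, something a hand-maintained static list would have to track manually.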

So far, there’s very little advantage to using EVPN, and a disadvantage of using a pretty complex piece of technology; things get a bit more interesting when you want to implement MLAG or layer-3 forwarding.

EVPN does reduce configuration complexity in fast-growing environments that can’t spell automation. On some devices, you have to configure static flood lists per VNI, whereas you only have to configure EVPN IBGP neighbors once.
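
A back-of-the-envelope comparison makes the difference visible. The sketch assumes a device that needs one flood-list entry per remote VTEP per VNI, and a full mesh of IBGP sessions for EVPN. Any per-VNI settings EVPN itself may need on a given platform are not counted, and the function names are invented for this example.

```python
def static_flood_entries(switches: int, vnis: int) -> int:
    """Flood-list entries configured on ONE switch: every remote VTEP, repeated per VNI."""
    return (switches - 1) * vnis


def evpn_neighbors(switches: int) -> int:
    """IBGP neighbors configured on ONE switch in a full mesh, regardless of VNI count."""
    return switches - 1


for vnis in (1, 10, 100):
    print(vnis, static_flood_entries(4, vnis), evpn_neighbors(4))
```

In the four-switch fabric, each new VNI adds three static flood-list entries per switch, while the number of BGP neighbors stays at three no matter how many VNIs you add.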

Want to Know More?

If you’re building a small fabric with just a few switches, you might get a good design following recipes you find on the Internet… but when you start building larger fabrics, your designs will be better if you understand the underlying technology and its tradeoffs. You’ll discover those in the Data Center webinars and Designing and Building Data Center Fabrics self-paced online course.

Some of you will have to design more than just the transport fabric. You’ll find all the material from the data center fabrics course and more details on compute, virtualization, storage, and network services in the Building Next-Generation Data Center instructor-led online course.

Finally, to learn more about EVPN technology, check out the EVPN Technical Deep Dive webinar.


  1. For some not-so-small value of “trivial” ;)

5 comments:

  1. I apologize in advance for the lengthy comment.
    It seems that the post is not complete. But since I know you both (I mean Nicola and you, two of the best networkers I have ever met in my life), and you both know my ideas about VXLAN and EVPN that we discussed recently in private e-mails, I want to repeat my thoughts here.
    I think that EVPN is an excellent standard for those who love Layer 2 (L2) services. We may say that it is an evolution of the VPLS service that addresses some limits in the original standards (RFCs 4761 and 4762). But as you both know, I have a mental problem: I cannot understand the usefulness of L2 services (I am aware that this is a personal obsession of mine). I think that the preference for L2 services has its origin in the enterprise world (pushed by well-known $vendors), while ISPs tend to work at Layer 3 (L3) only, even if they are urged to offer L2 services by their customers.
    And anyway, if someone loves them (I mean L2 services), the VXLAN solution with an EVPN control plane is probably the best, even in a small data center. However, let's try to analyze the problem a little bit deeper. Multi-tenancy can be implemented in various ways:
    - Traditional L2: using VLAN tags (global).
    - Advanced L2: using VXLAN VNIs (global). VXLAN has a control-plane problem. Leaving aside multicast routing (apparently not much loved in data center environments, and with serious scalability problems in large data centers) and ingress replication (also with serious scalability problems), someone (correctly) thought of using EVPN, with BGP handling remote MAC learning (plus other excellent features such as multihomed access management, mass MAC withdrawal, aliasing, etc.).
    - L3: L3VPN based on the BGP/MPLS model. Here too, multi-tenancy is implemented with a tag (the MPLS service label, which is local) and IP prefixes exchanged via BGP.
    The last two approaches are conceptually very similar, so much so that they remind me of the famous RFC 1925, rule 11.
    Finally, a personal thought of mine that goes beyond the arguments discussed in your post: do we really need VXLAN to implement multi-tenancy? Wouldn't it be enough to use a classic L3VPN model (perhaps without LDP if the DC switches do not support it)? And if 20 bits for multi-tenancy are considered too few compared to the 24 bits of the VXLAN VNI, you can use the new RFC 8277, which makes it possible to add more than one MPLS label to BGP-LU advertisements. So, using two labels, for example, 40 bits would be available for multi-tenancy, and using MPLSoUDP, you would have an encapsulation identical (in length) to VXLAN! But networkers, as you know, really like RFC 1925, rule 11 ...
  2. In this four-switch scenario with Ethernet or VXLAN flood-and-learn (F&L) back-to-back, how do we solve the all-active first-hop gateway requirement? By adding filters, different passwords, groups, or similar left and right, while disabling “duplicate detection”, or even by breaking VM mobility (if it is a reality) in such traditional networks (Ethernet and VXLAN both use F&L)?
    I’m interested in the proposal, as with VXLAN EVPN and the Distributed IP Anycast Gateway (IRB) I have an integrated solution that allows me to do so. Maybe this could be one reason to use EVPN in this tiny setup.
    I’m not saying that using EVPN is a simple solution, only that it is a potential advantage you can gain here.
    Looking forward to your opinion :-)
    -Lukas
    Replies
    1. Well, several vendors allow you to configure anycast IP gateways without EVPN ;) and they work just fine (wrote a series of blog posts on that a long while ago).

      Not sure what exactly would or would not work on what ASIC though - as you know the vendor everyone uses isn't exactly forthcoming with limitations of their chipsets.
  3. Quick question
    Is it possible to run EVPN over L2 PBB without MPLS?
    I don’t see any good reason why not..
  4. Theoretically this is an elegant solution. Operationally, how do I support this implementation without consulting services? A small datacenter implementation is usually attached to a small business operation which usually cannot afford salaries of network engineers supporting the latest and greatest whiz-bang methodology. So, give me a business case for implementing complexity I can't support with entry to mid-level network engineers.
    Replies
    1. The customer where I recommended this design is a large multinational. Contrary to what some vendors would love you to believe, many large enterprises don't need more than two switches per data center. I've been saying this for years, but of course nobody ever listens.

      And yes, they used a small consulting company for the implementation phase. They are smart enough to know when it makes sense to do things on their own and when to bring in professional help. Wish more people would be this realistic and mature.
  5. Hello Ivan,

    Could you make an update for this 4-switch EVPN+VXLAN configuration, as any change on ASIC and NOS today would require a change in this config? And how can we insert FW services in this design?

    Wish you a Happy New Year,

    Francois

    Replies
    1. > Could you make an update for this 4-switch EVPN+VXLAN configuration, as any change on ASIC and NOS today would require a change in this config?

      I don't get this part of your comment. ASIC changes often do not require configuration changes (unless you're using ASIC-specific features), while switching NOS almost always results in an interesting migration process.

      > And how can we insert FW services in this design?

      There's no magic in service insertion. You could use PBR or VLAN stitching. See https://my.ipspace.net/bin/list?id=SDNUseCases#SERVICES for more details.

      VLAN stitching can be implemented with statically-configured VXLAN or with EVPN -- EVPN details in https://my.ipspace.net/bin/list?id=EVPN#L47SVC
