
Cisco ACI – a Stretched Fabric That Actually Works

In mid-February a blog post on Cisco’s website announced the stretched ACI fabric (bonus points for not using marketing future tense but talking about a shipping product). Will it work better than other PowerPoint-based fabrics? You bet!

What’s the Big Deal?

Cisco’s ACI fabric uses distributed (per-switch) control plane with APIC controllers providing fabric configuration and management functionality. In that respect, the ACI fabric is no different from any other routed network, and we know that those work well in distributed environments.

But What about Stretched Subnets?

Even though you can use Cisco ACI to implement stretched subnets (and Cisco’s documentation includes the mandatory mention of VM mobility), the fabric uses VXLAN-over-IP as the transport protocol, making the underlying transport network rock-solid: you cannot get an L2 forwarding loop in a pure L3 network.
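To see why the transport stays pure L3, here’s a minimal sketch (Python, standard library only) of the VXLAN header layout from RFC 7348. ACI actually uses an extended iVXLAN header, so treat this as an illustration of the encapsulation idea, not the exact wire format:

```python
import struct

def vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header: a flags word (I bit set) + 24-bit VNI.

    Illustrative sketch of the RFC 7348 wire format, not ACI's
    enhanced iVXLAN header.
    """
    flags = 0x08 << 24                          # I flag set: VNI field is valid
    return struct.pack("!II", flags, vni << 8)  # VNI sits in the top 24 bits

# The original Ethernet frame rides as opaque payload behind this header,
# inside a routed UDP/IP packet (UDP port 4789), so the transport network
# only ever forwards unicast IP.
hdr = vxlan_header(10042)
assert len(hdr) == 8
assert struct.unpack("!I", hdr[4:8])[0] >> 8 == 10042
```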

Stretched subnets are as great an idea as they ever were (there’s nothing you can do to fix a broken concept), but ACI’s handling of stretched subnets is better than almost anything else out there (including OTV).

ACI uses ARP proxies and anycast gateways on leaf switches, and something equivalent to VRF-based host routing to forward traffic toward IP endpoints. The traffic transported across the fabric is thus mostly unicast IP traffic (admittedly encapsulated in VXLAN envelopes), and we know that IP-based networks have gotten pretty good at handling unicast IP traffic.
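The per-endpoint forwarding behavior can be sketched as a toy lookup table (all names and addresses are made up; this models the behavior, not ACI’s actual data structures):

```python
# Toy model of per-endpoint ("host route") forwarding on a leaf switch:
# every known endpoint IP maps to the VTEP of the leaf it sits behind,
# so intra-fabric traffic is unicast VXLAN regardless of subnet layout.
endpoint_table = {
    "10.0.1.10": "vtep-leaf1",
    "10.0.1.20": "vtep-leaf2",   # same subnet, different leaf
    "10.0.2.30": "vtep-leaf2",   # different subnet, same leaf
}

SPINE_PROXY = "vtep-spine-proxy"  # unknown destinations go to the spine proxy

def next_hop(dst_ip: str) -> str:
    # Host-route hit -> unicast VXLAN toward the owning leaf's VTEP;
    # miss -> hand off to the spine proxy instead of flooding the fabric.
    return endpoint_table.get(dst_ip, SPINE_PROXY)

assert next_hop("10.0.1.20") == "vtep-leaf2"
assert next_hop("10.0.9.9") == SPINE_PROXY
```

Note how two endpoints in the same subnet can sit behind different leaves without any flooding: the lookup is on the endpoint, not the subnet.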

But There’s the Split-Brain Problem

True – and Cisco was quick to admit the problem exists (many vendors try to pretend you don’t have a problem because the redundant links between sites can never fail) and documented the split fabric scenario in their design guidelines:

  • Controller cluster is split, and the minority part of the cluster goes into read-only mode;
  • The fabric continues to forward traffic based on already-established policy rules;
  • Leaf switches can detect new endpoints (assuming they’re covered by existing policy rules) and report them to the spine switches – both isolated fabrics thus continue to operate normally even through edge or core topology changes.
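The quorum behavior in the first bullet can be modeled in a few lines (the APIC cluster sizes are illustrative, and the model deliberately ignores everything but the majority test):

```python
# Minimal sketch of the split-fabric behavior described above: the
# partition holding a majority of APIC controllers stays read-write,
# the minority drops to read-only, and the data plane keeps forwarding
# on already-installed policy rules either way.

def partition_mode(cluster_size: int, reachable: int) -> str:
    """Return the controller mode for a partition seeing `reachable` APICs."""
    return "read-write" if reachable > cluster_size // 2 else "read-only"

# A 3-node APIC cluster split 2/1:
assert partition_mode(3, 2) == "read-write"   # majority side
assert partition_mode(3, 1) == "read-only"    # minority side

def forwards_traffic(mode: str) -> bool:
    # Forwarding is independent of controller mode in this model:
    # existing policy rules keep working in both partitions.
    return True

assert forwards_traffic("read-only")
```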

More Information

If you’re interested in data center fabrics, you (RFC 2119) MUST watch the Cisco ACI videos from NFD8 and NFD9, and you SHOULD register for the Data Center Fabrics Update webinar in mid-May.

Disclosure: Cisco Systems was indirectly covering some of the costs of my attendance at the Network Field Day 9 event. More…

21 comments:

  1. I completely share your enthusiasm about Cisco's ACI. I would add that it revolutionized SDN technology with – but not limited to – its group-based policy contribution. This GBP is currently being incorporated into two major open-source projects: OpenStack & OpenDaylight.
    You'll probably be interested in the Lippis report about ACI: http://lippisreport.com/2015/03/lippis-report-223-an-open-approach-to-network-automation/

  2. I'd like to add another Lippis report specifically focused on ACI: http://lippisreport.com/2014/08/lippis-report-222-cisco-preps-aci-for-general-availability-what-to-expect/

  3. One of the issues with ACI and the current crop of Nexus 9K ToR switches is that they cannot do "routed VXLAN," while there are models from other vendors, like the Juniper QFX5100, that can. So any kind of inter-VXLAN traffic in the current ACI infrastructure needs to be punted to a gateway-on-a-stick setup, which is less than optimal. If your inter-VXLAN gateway exists in another data center, it could get very messy.

    Unless I'm misunderstanding the current limitations, but the designs I've seen always show another device on a stick acting as a way to route between VXLANs.

    1. Are you sure? I thought it was precisely the opposite.

      The QFX5100 uses just the Trident II -- which is "known" (for varying values of "known") to NOT support VXLAN routing. The N9K leverages both the Trident II, as well as custom Cisco silicon -- the latter being specifically able to handle VXLAN routing.

    2. Actually, you're right. I've asked Juniper the same question before: the QFX5100 does L2 VXLAN gateway only; the MXs, however, do VXLAN routing.

    3. You are right, got confused with a roadmap doc I recently saw. :)

    4. Hi Phil,
      That VXLAN routing leaf-on-a-stick is required in standalone (NX-OS) mode if you are running older NX-OS code. However, with the new VXLAN-EVPN/anycast-gateway feature in version 7 code, you don't need that stick any more.
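For context, a minimal sketch of the standalone NX-OS anycast-gateway configuration that comment refers to (the VLAN number, address, and gateway MAC are made up; verify the exact commands against Cisco's VXLAN EVPN configuration guide for your release):

```
feature fabric forwarding
! One virtual gateway MAC shared by every leaf (illustrative value)
fabric forwarding anycast-gateway-mac 0000.2222.3333

interface Vlan100
  no shutdown
  ip address 10.0.100.1/24
  ! The same gateway IP/MAC is active on every leaf - no stick needed
  fabric forwarding mode anycast-gateway
```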

  5. Ivan, weren't you against ARP proxies on leaf switch?

    1. Help me figure out in what context that was. BTW, it seems ACI is not doing proxy ARP, I was told they're transforming broadcast ARP into unicast ARP.

    2. That is correct - turns into a unicast on the ACI fabric since all endpoints' locations are known; therefore no need to flood in the fabric.

  6. With the recent implementation of the BGP EVPN VXLAN control plane (http://www.cisco.com/c/en/us/products/collateral/switches/nexus-9000-series-switches/white-paper-c11-733737.html):
    - the VM MAC/IP addresses are distributed in the fabric (and even inter-fabric), meaning that all leaf switches can proxy ARP
    - the VM mobility is transparently supported
    - now, "Nexus 9300 switches with the ALE ASIC offer the capability to route VXLAN overlay traffic at the leaf, unlike traditional Broadcom Trident II based platforms, which cannot VXLAN route the packet".
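The first point can be modeled with a toy sketch: EVPN type-2 routes carry each endpoint's MAC+IP to every leaf, so any leaf can answer ARP locally instead of flooding (the field names and addresses are illustrative, not the actual EVPN NLRI encoding):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Type2Route:
    """Simplified stand-in for a BGP EVPN type-2 (MAC/IP) advertisement."""
    mac: str
    ip: str
    vtep: str  # leaf that owns the endpoint

# Route table as received by every leaf in the fabric:
evpn_routes = [
    Type2Route("00:50:56:aa:bb:01", "10.0.1.10", "vtep-leaf1"),
    Type2Route("00:50:56:aa:bb:02", "10.0.1.20", "vtep-leaf2"),
]

def proxy_arp_reply(target_ip: str):
    """Answer an ARP request from the local EVPN table (no flooding)."""
    for route in evpn_routes:
        if route.ip == target_ip:
            return route.mac
    return None  # unknown endpoint: fall back to the fabric's miss handling

assert proxy_arp_reply("10.0.1.20") == "00:50:56:aa:bb:02"
assert proxy_arp_reply("10.0.9.9") is None
```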

  7. Yep Jean.

    EVPN not supported in ACI mode yet. Expect support for EVPN in Q4

    BTW up until now we really like ACI.

    I just setup our second ACI pod.

    1. Forgive my ignorance, but what would be the benefit of using EVPN in ACI mode? Don't you get the basic control- and data-plane structure of ACI by using EVPN instead anyway (in NX-OS mode)?

    2. Advertising endpoint addresses between segregated ACI fabrics.

  8. Too many people believing the Cisco marketing BS...

    Trident II can route VXLAN packets; it requires a recirculation, the same as the Nexus 9300, which uses the NorthStar ASIC for the recirculation.

  9. Yes, sure :) that's what they do... people just don't know it.

  10. If you want to see both inbound and outbound routing correction in an Active-Active setup with an ACI stretched Fabric, take a look here - http://www.youtube.com/watch?v=eAQ1ps0AGbY

    This was done with all GA code from Cisco.

  11. Is there any new development for stretched subnets in Cisco ACI if the two data centers are more than 50 km apart?

    1. Yes, now it's supported up to 800 km. See https://www.youtube.com/watch?v=RLkryVvzFM0 and https://www.youtube.com/watch?v=xgxPQNR_42c

  12. Ivan, any plans to blog about ACI over Super long distance?
    https://www.youtube.com/watch?v=RLkryVvzFM0

