Is Cisco ACI Too Different?
A friend of mine involved in multiple Cisco ACI installations sent me this comment on their tenant connectivity model:
I’m a bit allergic to ACI. The abstraction is mis-aligned with familiar configurations, in particular contracts being independent of and over-riding routing, tenants, etc. You can really make a mess with that, and I’ve seen some! One needs to impose some structure, naming conventions…, and most people don’t seem to get that done.
As I noticed in the NSX-or-ACI webinar, it’s interesting how NSX decided to stay with the familiar VLAN/routing/filtering paradigm (more details), whereas the designers of Cisco ACI decided to go down a totally different path.
There’s nothing wrong with being different, and the Cisco ACI connectivity model might be an ideal abstraction, and just what’s needed if you’ve never seen IP networking before… but it’s hard to change the mindset of everyone who ever heard about TCP/IP, or support the gazillion broken enterprise applications that rely on dirty VLAN-based tricks like shared MAC or IP addresses. Not surprisingly, many Cisco ACI installations turn into glorified (and overly complex) VLAN managers.
But of course it gets worse… enterprise environments expect GUI-based configuration, and while that encourages the continued creation of bespoke snowflake environments, it has another drawback:
Documentation would help. But people think the GUI is the documentation.
Sometimes it gets as ridiculous as a networking engineer writing an automation script to collect Cisco ACI tenant configurations to help identify configuration parameter creep in supposedly-identical tenant deployments.
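As a rough illustration of what such a drift-detection script might do once the tenant configurations have been collected, here is a minimal sketch. It assumes each tenant's configuration is already available as a nested dictionary; the parameter names (`arp_flooding`, `unicast_routing`, ...) and tenant names are made up for the example and are not real ACI attribute names.

```python
from typing import Any

MISSING = "<missing>"  # marker for a parameter present on one side only


def flatten(obj: Any, prefix: str = "") -> dict[str, Any]:
    """Flatten nested dicts/lists into {dotted.path: leaf_value}."""
    flat: dict[str, Any] = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            path = f"{prefix}.{key}" if prefix else str(key)
            flat.update(flatten(value, path))
    elif isinstance(obj, list):
        for index, value in enumerate(obj):
            flat.update(flatten(value, f"{prefix}[{index}]"))
    else:
        flat[prefix] = obj
    return flat


def find_drift(baseline: dict, tenants: dict[str, dict]) -> dict[str, dict]:
    """Return {tenant: {path: (expected, actual)}} for every deviation."""
    base = flatten(baseline)
    drift: dict[str, dict] = {}
    for name, config in tenants.items():
        flat = flatten(config)
        diffs = {}
        for path in sorted(base.keys() | flat.keys()):
            expected = base.get(path, MISSING)
            actual = flat.get(path, MISSING)
            if expected != actual:
                diffs[path] = (expected, actual)
        if diffs:
            drift[name] = diffs
    return drift


# Hypothetical data: two tenants that are supposed to be identical.
BASELINE = {
    "vrf": {"enforced": True},
    "bridge_domain": {"arp_flooding": False, "unicast_routing": True},
}
TENANTS = {
    "tenant-a": {
        "vrf": {"enforced": True},
        "bridge_domain": {"arp_flooding": False, "unicast_routing": True},
    },
    "tenant-b": {
        "vrf": {"enforced": True},
        "bridge_domain": {
            "arp_flooding": True,
            "unicast_routing": True,
            "subnets": ["10.0.0.1/24"],
        },
    },
}

for tenant, diffs in find_drift(BASELINE, TENANTS).items():
    for path, (expected, actual) in diffs.items():
        print(f"{tenant}: {path}: expected {expected!r}, found {actual!r}")
```

Comparing every tenant against a single baseline (rather than pairwise) keeps the report short, but it assumes you can agree on what the baseline is, which is exactly the documentation problem described above.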
Is there a way out of this morass? Infrastructure-as-code and automation obviously help, and you can find several Cisco ACI deployment automation examples in our Network Automation Solutions showcase… but it turns out that the moment you start automating your deployments, you might not need Cisco ACI anymore. Back to my friend:
I just did a manual 2-spine/10-leaf VXLAN deployment with multi-site connectivity, and clearly automation is the better way to go, primarily for accuracy of configs. Customer likes Ansible for automation, and now that the fabric is built, adding VLANs is pretty easy templating.
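To give an idea of what "pretty easy templating" can look like, here is a minimal sketch. My friend's customer uses Ansible; this example swaps in Python's standard `string.Template` purely to stay self-contained. The VLAN-to-VNI numbering scheme (`VNI_BASE`) and the VLAN names are assumptions made for illustration, and the rendered lines follow NX-OS-style VLAN/VNI syntax that you should check against your own platform.

```python
from string import Template

# NX-OS-style snippet mapping a VLAN to a VXLAN segment (illustrative only).
VLAN_TEMPLATE = Template(
    "vlan $vlan_id\n"
    "  name $name\n"
    "  vn-segment $vni\n"
)

VNI_BASE = 10000  # assumption: VNI = 10000 + VLAN ID


def render_vlan(vlan_id: int, name: str) -> str:
    """Render the configuration snippet for a single VLAN."""
    if not 1 <= vlan_id <= 4094:
        raise ValueError(f"invalid VLAN ID: {vlan_id}")
    return VLAN_TEMPLATE.substitute(
        vlan_id=vlan_id, name=name, vni=VNI_BASE + vlan_id
    )


def render_vlans(vlans: dict[int, str]) -> str:
    """Render all VLANs in ascending VLAN ID order."""
    return "".join(render_vlan(vid, name) for vid, name in sorted(vlans.items()))


# Hypothetical VLAN list; in practice this would come from a data file.
print(render_vlans({200: "db", 100: "web"}))
```

The point is not the template itself but that the VLAN list becomes data: adding a VLAN means adding one line to a data file, and every leaf gets the same, consistently derived configuration.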
But even if you decide to go down this path, you’ll quickly face another challenge: should you build your own solution (aka “invest in premium people”), in which case you might find our network automation course useful, or buy a premium product from an umbrella orchestration vendor and spend the rest of your life fine-tuning it to meet your “unique” needs. As always, once you start looking into the details, there is no easy answer.
I like Application Centric Infrastructure for its "application centric" approach... and you have to go down this path to find it very useful. I consider EPGs and contracts to be just like security groups and rules inside AWS: they are very loosely tied to network constructs (IP addresses, subnetting/routing, ...). When designing an ACI fabric, we try to follow this philosophy and make decisions along this path.
Yes, we took several hours to find a naming convention for every type of object in ACI, because it is object-oriented.
And, to be really transparent, when starting with ACI, you have to think about automation from the beginning (even if you don't automate at first). It has to come quickly, in order to leverage the abstraction ACI offers. That's where API First is a real thing, and ACI gives you all the tools (API Inspector) to go API.
We now have architects and developers who can define their architectures logically without any IP-related information, and provision all EPGs and contracts (and ...) with API calls, from a JSON/YAML/CSV/xls description file.
And so, I wouldn't recommend ACI if you want to stay in a "network-centric" mode, I agree.
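A minimal sketch of that idea follows: it turns a JSON description containing only logical names into a nested payload shaped like the ACI object model. The description format, tenant, EPG and contract names are invented for the example. The class and attribute names (`fvTenant`, `fvAp`, `fvAEPg`, `fvRsProv`, `fvRsCons`, `tnVzBrCPName`) reflect my understanding of the ACI object model and should be confirmed with the API Inspector before use. The sketch only builds the payload; it does not define the contracts themselves, bind EPGs to bridge domains, or talk to an APIC.

```python
import json

# Hypothetical description file: logical names only, no IP information.
DESCRIPTION = """
{
  "tenant": "shop",
  "application": "webstore",
  "epgs": [
    {"name": "web", "provides": ["web-svc"], "consumes": ["app-svc"]},
    {"name": "app", "provides": ["app-svc"], "consumes": []}
  ]
}
"""


def mo(cls: str, children=(), **attributes) -> dict:
    """Wrap one managed object as {class: {"attributes": ..., "children": ...}}."""
    return {cls: {"attributes": attributes, "children": list(children)}}


def build_tenant_payload(description: dict) -> dict:
    """Build a tenant > application profile > EPG tree from the description."""
    epgs = []
    for epg in description["epgs"]:
        relations = [
            mo("fvRsProv", tnVzBrCPName=contract)
            for contract in epg.get("provides", [])
        ]
        relations += [
            mo("fvRsCons", tnVzBrCPName=contract)
            for contract in epg.get("consumes", [])
        ]
        epgs.append(mo("fvAEPg", relations, name=epg["name"]))
    app = mo("fvAp", epgs, name=description["application"])
    return mo("fvTenant", [app], name=description["tenant"])


payload = build_tenant_payload(json.loads(DESCRIPTION))
print(json.dumps(payload, indent=2))
```

Keeping the description file free of class names means the architects never see the object model; only the small translation layer has to change if the payload format does.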
1) "One needs to impose some structure, naming conventions…, and most people don’t seem to get that done" Well, for me that's a great example of why clients and VARs should start automating, to give their deployments and DCs some structure and standardize them. You can make a mess out of any technology in the world if you can't design it properly, that's a fact.
2) "but it turns out that the moment you start automating your deployments, you might not need Cisco ACI anymore" - that is very subjective. Yes, you can automate your 3 DCs with 6 spines and 12 leafs and start extending your VXLANs without ACI, but that's something you can do only if you have a team of at least 2-3 developers who have the skills for it. That's not the typical enterprise, which has a team of 2 engineers who spend most of their time on P1 cases while also helping their boss fix his Outlook.
3) A great benefit of APIC which you didn't mention is that it is the single source of truth in your DC. Yes, you can start automating your DC without APIC, using just the standard NXOS APIs, but the configuration API is not the only reason to run ACI. APIC gives you much more than just an API for configuring your DC: it also provides real-time metrics from your infrastructure, APIs for your telemetry, etc. Yes, the APIC APIs look like they were built in 2000 (which may actually be true) and they have many gaps, but that would be another talk.... The NXOS APIs are in my opinion even worse, but that's just a matter of taste....
I like ACI and I have been working with it for years. We got used to the constructs and different concepts like BD, EPG, ANP and contracts. We have seen the potential of automation and it’s very obvious (especially when you want to allow 173 VLANs on a certain vPC).
Yet we have never really been able to transform into a pure application-centric architecture. Actually, we have never been there; we had discussions about that with the security guys and nothing moved forward. It’s clear it depends on the environment, and despite what Cisco claims, I cannot imagine replacing firewall rules with contracts only.
Regarding troubleshooting, it’s very difficult to do native command-line troubleshooting without TAC support. The number of hidden commands is huge and the internals are practically taboo.
There are a lot of myths regarding ACI, and a lot of people are adding to them, especially those with a bias toward other vendors. However, when I explored different vendors, I found them facing the same challenges, and they might be offering the same results as ACI.
In my opinion, SDN is just in its early phases and is still open for development.
I have been involved in both setting up an ACI multi-pod fabric, and running an Arista EVPN based network, and either way you're gonna need to do your own automation, and have your own team of automation staff.
To my mind ACI is simply overly complicated, and although it does automate certain things for you out of the box, such as building the underlay and adding new leafs, the learning curve to get there is simply way too steep.
Yes, naming conventions are key, but if you've never deployed an ACI fabric before, how on earth are you gonna know which conventions to use? And once you've typed in a name in ACI, you can't change it! Already at this point you need to automate stuff, to roll back your initial naming mistakes and fix them.
Yes, linking routing and contracts is a big mistake by the designers. Routing is control-plane, and should work the second you enable it, irrespective of contracts. Contracts are security policy, and just like firewalls, Azure NSGs etc., they should have no effect on routing.
I haven't met anyone yet who has done an application-centric implementation of ACI. They have all been network-centric, and the number of steps you need to perform in the ACI GUI for even the simple operation of emulating a VLAN with a Bridge Domain, EPG, and Contracts, and linking it to a VRF, is such that you cannot do it manually without making mistakes. You simply NEED to automate that to get it right.
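To make the point concrete, here is a small sketch of how one "simple VLAN" fans out into several linked objects, plus a check that catches a forgotten step. It is a plain-Python model, not ACI API code; the `AciObject` type, the helper names, and the naming scheme (`BD_VLAN100`, `EPG_VLAN100`, ...) are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AciObject:
    kind: str                     # "VRF", "BD", "Contract" or "EPG"
    name: str
    links: tuple[str, ...] = ()   # names of the objects this one refers to


def vlan_as_aci_objects(vlan_id: int, vrf: str) -> list[AciObject]:
    """Expand one network-centric VLAN into the objects that emulate it."""
    bd = f"BD_VLAN{vlan_id}"
    epg = f"EPG_VLAN{vlan_id}"
    contract = f"CT_VLAN{vlan_id}"
    return [
        AciObject("VRF", vrf),
        AciObject("BD", bd, links=(vrf,)),
        AciObject("Contract", contract),
        AciObject("EPG", epg, links=(bd, contract)),
    ]


def dangling_links(objects: list[AciObject]) -> list[tuple[str, str]]:
    """Return (object, missing target) pairs for links that point nowhere."""
    known = {obj.name for obj in objects}
    return [
        (obj.name, link)
        for obj in objects
        for link in obj.links
        if link not in known
    ]


objects = vlan_as_aci_objects(100, vrf="VRF_PROD")
print(f"{len(objects)} objects for one VLAN")

# Simulate a manual deployment where the bridge domain step was skipped.
incomplete = [obj for obj in objects if obj.kind != "BD"]
print(dangling_links(incomplete))
```

Generating all objects from a single VLAN ID removes both the repetitive clicking and the chance of inconsistent names, which matters all the more when names cannot be changed afterwards.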
I recognize the fact that if you wanna deploy an EVPN fabric such as Arista's using CloudVision, you also need to perform a fair amount of programming, and create your own "source of truth". To be efficient you need to automate that too, but you could start out with simple, well-known CLI-based templates and automate as you grow. It's so much easier to troubleshoot, CloudVision provides magnificent telemetry, and you can apply the same technology both in the datacenter and in the campus, rather than having to struggle with the incompatible plugs of ACI and SDA.