Running vSphere on Cisco ACI? Think Twice…

When Cisco ACI was launched it promised to do everything you need (plus much more, and in a multi-hypervisor environment). It quickly became obvious that you can’t do all that on ToR switches, and that you need control of the virtual switch (the real network edge) to get the job done.

Updates:
  • 2017-03-17: It was not clear that the API I was referring to is the 3rd party vSwitch data plane API.
  • 2017-03-26: VMware announced what I described here a week later. So much for "they will never dare to do it" wishful thinking from $NetVendor engineers.

No problem, there’s Cisco Application Virtual Switch. Well, its deployment on vSphere was not supported by VMware (in case you care), and it seems VMware might throw another spanner in the works.

Looks like even Nexus 1000V is currently not working on vSphere 6.5, and based on some VMware training material I’ve seen, third-party distributed switches might not be supported at all in vSphere 6.5U1. Whether that means “don’t call us” or “they won’t work because the API will be gone” wasn’t exactly clear (so we’ll have to wait until 6.5U1 comes out), but with vDS being pretty close to Nexus 1000V feature-wise (and NSX competing with ACI), I would expect VMware to kill the internal vDS data/control-plane API used by third-party virtual switches, just like they killed the dvFilter API.

That would degrade Cisco ACI used in vSphere environments into a smarter L2+L3 data center fabric. Is that worth the additional complexity you get with ACI? It depends… and we’ll discuss some aspects of that in the VMware NSX, Cisco ACI or standards-based EVPN workshop in mid-May.

Note to potential commenters: you cannot implement EPGs and all the other fun stuff at a granularity finer than VLANs unless you control the virtual switch. The proof is left as an exercise for the reader.

Update 2017-03-21: As is often the case I stand corrected (thanks to g9ais and Nillo) - you can use PVLAN to pull all traffic out of the hypervisor to the ToR switch, and process it there. I still think there are things that can go wrong with that approach with just the right mix of flooded traffic, so I don't think it's functionally equivalent to having full control of the virtual switch. Can't figure out at the moment whether that's relevant or not - any comment would be highly welcome.
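To illustrate what the vSphere side of that trick involves, here is a minimal pyVmomi sketch that adds a private-VLAN pair to an existing vDS and creates a port group sitting in the isolated secondary VLAN, so that all VM traffic gets hairpinned through the promiscuous side (typically the upstream ToR). This shows only the generic vSphere PVLAN mechanism, not necessarily how APIC programs it; the vCenter address, credentials, switch name and VLAN IDs are made-up placeholders, and task completion and error handling are omitted.

    import ssl
    from pyVim.connect import SmartConnect
    from pyVmomi import vim

    si = SmartConnect(host='vcenter.example.com', user='administrator@vsphere.local',
                      pwd='secret', sslContext=ssl._create_unverified_context())
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.dvs.VmwareDistributedVirtualSwitch], True)
    dvs = next(d for d in view.view if d.name == 'dvs-1')

    # PVLAN map: primary 100 (promiscuous) plus isolated secondary 101
    # (the secondary is what the VM-facing port groups will use)
    def pvlan_entry(primary, secondary, pvlan_type):
        return vim.dvs.VmwareDistributedVirtualSwitch.PvlanConfigSpec(
            operation='add',
            pvlanEntry=vim.dvs.VmwareDistributedVirtualSwitch.PvlanMapEntry(
                primaryVlanId=primary, secondaryVlanId=secondary, pvlanType=pvlan_type))

    dvs_spec = vim.dvs.VmwareDistributedVirtualSwitch.ConfigSpec(
        configVersion=dvs.config.configVersion,
        pvlanConfigSpec=[pvlan_entry(100, 100, 'promiscuous'),
                         pvlan_entry(100, 101, 'isolated')])
    dvs.ReconfigureDvs_Task(spec=dvs_spec)

    # Port group whose ports sit in the isolated secondary PVLAN, forcing
    # inter-VM traffic out of the host toward the promiscuous side
    pg_spec = vim.dvs.DistributedVirtualPortgroup.ConfigSpec(
        name='epg-web-pvlan', type='earlyBinding', numPorts=128,
        defaultPortConfig=vim.dvs.VmwareDistributedVirtualSwitch.VmwarePortConfigPolicy(
            vlan=vim.dvs.VmwareDistributedVirtualSwitch.PvlanSpec(pvlanId=101, inherited=False)))
    dvs.AddDVPortgroup_Task([pg_spec])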

46 comments:

  1. There is a way out of this. Have your Nexus 1000V as a VM on every host, with the user VMs, each in their own locally unique VLAN, connected to the Nexus VM via these p2p VLANs. There's one SDN vendor that does exactly that.
    Replies
    1. Sounds like a poor design decision. If you compare NSX (DLR in the kernel) with deploying one VM on every host and hairpinning all your traffic through that VM, it's obvious which would be the preferred solution. :)
    2. Obviously there will be some performance penalty. That is exactly what VMware is trying to do: no product should fit in better than NSX.
    3. That's what Nicira was doing and what Nuage is probably still doing. The "only" problem: traversing userland kills performance. Nuage improved it by using SR-IOV on the external VM connection, but still.

      While it's easy to make fun of how limited VMware virtual switch is, its performance was always stellar (at least compared to alternatives).

      Not to mention that this becomes a nightmare to manage, more so as you need a different VLAN on every inter-VM connection (or traffic gets bridged before it hits _your_ virtual switch).
    4. I wouldn't say it "kills" performance. It's a lot less detrimental than traversing a chained NFV (e.g. a firewall). Nuage was doing it because they didn't have a strong hardware story. Cisco, on the other hand, can ditch the vSwitch altogether and extend these p2p VLANs to the hardware switches, similar to what they're doing with their Neutron plugin. This way they can keep the per-VM granularity with hardware performance. The only price they pay in this case is the ability to innovate and add new features. But like you said, nothing that can't be fixed with a bit of NAT, PBR and VXLAN.
    5. "It's a lot less detrimental then traversing a chained NFV(e.g. firewall)." << It's exactly the same thing.

      "Cisco, on the other hand, can ditch the VS altogether and extended these p2p vlans to the hardware switches" << Don't forget that you can't turn off vSwitch in ESXi or reconfigure it at will like you can in Linux.
    6. "Don't forget that you can't turn off vSwitch in ESXi or reconfigure it at will like you can in Linux" << Nothing special is needed in this case. SDN controller only needs basic integration with vCenter to make 2 API calls - 1) create an sPG 2) Attach a VM to this sPG.
    7. To recap:
      * One port group (or more) per EPG
      * Separate VLAN for every EPG
      * Private VLANs if you don't want VMs within EPG to communicate
      * SDN controller creating vSphere port groups and attaching VMs to port groups behind the scenes.

      I'm sure it will work and scale amazingly well, and everyone will be delighted (including vSphere admins), and I'm only imagining problems where there are none...
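    For readers wondering what those two vCenter calls look like in practice, here is a rough pyVmomi sketch: create a distributed port group on an existing vDS and move a VM's vNIC into it. The vCenter host, switch, VM and port-group names and the VLAN ID are hypothetical, tasks are not awaited, and a real controller would obviously do this through its own integration layer rather than a script like this.

      import ssl
      from pyVim.connect import SmartConnect
      from pyVmomi import vim

      si = SmartConnect(host='vcenter.example.com', user='administrator@vsphere.local',
                        pwd='secret', sslContext=ssl._create_unverified_context())
      content = si.RetrieveContent()

      def find_by_name(vimtype, name):
          # Walk the vCenter inventory and return the first object with a matching name
          view = content.viewManager.CreateContainerView(content.rootFolder, [vimtype], True)
          return next(obj for obj in view.view if obj.name == name)

      dvs = find_by_name(vim.DistributedVirtualSwitch, 'dvs-1')
      vm = find_by_name(vim.VirtualMachine, 'web-01')

      # 1) Create a port group (one per EPG), pinned to a VLAN chosen by the controller
      pg_spec = vim.dvs.DistributedVirtualPortgroup.ConfigSpec(
          name='epg-web', type='earlyBinding', numPorts=128,
          defaultPortConfig=vim.dvs.VmwareDistributedVirtualSwitch.VmwarePortConfigPolicy(
              vlan=vim.dvs.VmwareDistributedVirtualSwitch.VlanIdSpec(vlanId=2101, inherited=False)))
      dvs.AddDVPortgroup_Task([pg_spec])

      # 2) Attach the VM's first vNIC to the new port group (wait for the task above first)
      pg = find_by_name(vim.dvs.DistributedVirtualPortgroup, 'epg-web')
      nic = next(d for d in vm.config.hardware.device
                 if isinstance(d, vim.vm.device.VirtualEthernetCard))
      nic.backing = vim.vm.device.VirtualEthernetCard.DistributedVirtualPortBackingInfo(
          port=vim.dvs.PortConnection(portgroupKey=pg.key, switchUuid=dvs.uuid))
      change = vim.vm.device.VirtualDeviceSpec(
          operation=vim.vm.device.VirtualDeviceSpec.Operation.edit, device=nic)
      vm.ReconfigVM_Task(spec=vim.vm.ConfigSpec(deviceChange=[change]))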
  2. "Some VMware training material" - could you be more specific?
  3. Note that Cisco states that microsegmentation in ACI can also be achieved when running the DVS, through creative use of private VLANs on the DVS + MAC filters in the leaf switches + proxy ARP. Would this be a case of "just because you can, doesn't mean you should"?
    Replies
    1. Well, true at a high level, but the implementation details are more complicated, which allows for using uSeg at scale.
      Also, to be honest, VMM integration is great, but far from mandatory. You can achieve exactly the same level of integration by putting some automation in place and using physical domains... no functional difference in the end. So who cares, except VMware fanboys who are anyway too emotional to limit DC design to real requirements?
    2. s/VMware fanboys/Cisco fanboys/
      Seriously, why are Cisco staff so petty?
  4. If VMware eliminated the API they would burn bridges with existing customers who are using vSphere + ACI. Not just small ones; I'm referring to large corporations such as banks.

    They would also screw other partners such as Big Switch who rely on the same APIs.

    If VMware wants to do such a thing and push the world even closer to using open-source hypervisors and containers, so be it.
  5. Times change, Nicolas...

    http://static-void.io/vmware-nsx-use-case-simplifying-disaster-recovery-part-1/

    Before joining Cisco, you were a VMware / NSX fanboy as well ;-)
    Replies
    1. Well, not a fanboy, but I still do like NSX and all the VMware products in general, and I'm still working with them on a daily basis. The path to VCDX taught me how to tackle global design exercises... which has nothing to do with products, but with requirements. Sometimes people tend to forget this, and that is a pity.
    2. @anonymous So, what's your insinuation? This is still, hopefully, a technology page, not a high-school debate gotcha!
      He wrote a blog on how to use NSX for a particular use case. Is that bad? At least he can say he understands both products.
  6. Other than speculating and pointing to other blogs hinting at technical difficulties you haven't cared to verify yourself, what actual facts can you point us all to? If VMware discontinues support for the vDS API, how do you suggest their own tools provision vDS?
    Replies
    1. There's the management API and then there's the "3rd party data plane API" that switches like Nexus 1000V use to integrate with vDS framework within vSphere. Based on the tone of your comment I would expect you to know that.

      Also, based on the fact you chose to remain anonymous my end of this discussion stops right here.
  7. Ivan, I believe you are misinformed. I am not talking specifically about the vSphere API that AVS uses, and the constant rumours about it, but more about ACI in general.

    It is true that VMware is becoming a company whose technology is increasingly vertically integrated, and eventually they want it so that if you use their hypervisor you have to use their SDN solution and their orchestration system and their VDI and ... I think the market will steer things otherwise … but we shall see! 😃

    I don’t know how much of an opportunity you’ve had to get hands-on with ACI recently. I’d love to spend time showing you how some of the things we do with ACI work. Meanwhile I provide here my respectful feedback.

    I definitely disagree with some comments that you made. Some are in fact open for debate, for instance:

    “You can’t do all that on ToR switches, and need control of the virtual switch”.

    I would say you can do a lot of what you need to do for server networking on a modern ToR. And yet you are right: you do need a virtual switch. That is clear. How much functionality you put on one vs. the other is a subject for debate, with pros and cons.

    But in an SDN world with programmable APIs, it does not mean you need YOUR virtual switch in order to control it. You just need one virtual switch that you can program. That is all.

    There’s a lot that we can do on AVS that we can do on OVS too. And we do it on OVS too. There’s a lot we do on AVS that we can’t do with VDS. But there’s enough that we can do in VDS so that, when combining it with what we do on the ToR, we deliver clever things (read more below).

    The below comment on the other hand is imho misinformed:

    “That [running without AVS] would degrade Cisco ACI used in vSphere environments into a smarter L2+L3 data center fabric. Is that worth the additional complexity you get with ACI? "

    First, it is wrong to assume that in a vSphere environment ACI is no more than smarter L2+L3 data center fabric. But even if it was only that … it is a WAY smarter L2+L3 fabric.

    And that leads me to your second phrase. What is up with the “additional complexity you get with ACI”?

    This bugs me greatly. ACI has a learning curve, no doubt. But we need to understand that the line between “complex” and “different” is crossed by eliminating ignorance.

    Anyone that has done an upgrade on any more than a handful of switches from any vendor, and then conducts a network upgrade (or downgrade) on dozens and dozens of switches under APIC control, will see how it becomes much simpler.

    Configuring a new network service involving L2 and L3 across dozens and dozens of switches in multiple data centers is incredibly simpler. Presenting those new networks to multiple vCenters? Piece of cake. Finding a VM on the fabric by querying for the VM name? … done. Reverting the creation of the previously created network service is incredibly simpler too. I mean … compared to traditional networking … let me highlight: INCREDIBLY simpler.

    Changing your routing policies to announce or not announce specific subnets from your fabric, or making a change to update storm-control policies on hundreds or thousands of ports, … or - again - reverting any of those changes, becomes really simple. Particularly when you think about how you were doing it on NX-OS or on any other vendor’s box-by-box configuration system.

    And the truth is that APIC accomplishes all of that, and more, with a very elegant architecture based on distributed intelligence in the fabric combined with a centralised policy and management plane on a scale-out controller cluster.

    Other vendors require combining six different VMs performing three different functions just to achieve a distributed default gateway that is only available to workloads running on a single-vendor hypervisor. Now that’s complex, regardless of how well you know the solution.

    Full response with details about how we do uSeg with VDS here:
    http://nillosmind.blogspot.com/2017/03/a-response-to-running-vsphere-on-cisco.html
    Replies
    1. Thanks for an extensive reply. I'll stick to the technical details, in particular the PVLAN part. While I admire the ingenuity of the idea, the deviation from the IEEE 802.1 forwarding paradigm worries me. For example, if you have to flood traffic from the ToR switch to an EPG member, everyone in the same PVLAN will get the traffic even if they're not in the same EPG.
    2. There is no deviation Ivan. There is nothing magical here. Again, I extend the offer to spend time with you actually testing this, so we can move on from theoretical discussion. It is the best way to stick to the technical details and remove assumptions.
    3. I'm not sure I understand this point; this is the principle of a bridge domain, i.e. the flooding boundary, PVLAN or not. It's always been the case. You can also optionally flood in the EPG.
      In addition, although this is PVLAN in the VDS, it's not PVLAN on the ToR. This means:
      - Normal ACI BD semantics are applied, regardless of whether the EPG is enabled for uSeg or not (understand: PVLAN pushed to the port group). All EPGs inside the same bridge domain receive the same PVLAN IDs (no burning of VLANs). The secondary isolated VLAN in ACI then translates into a reserved pcTag that is set with deny({src,dst}=isolated encap); no PVLAN implementation there at all...
      - MAC-based classification is pushed to the ToR for the base EPG
      - The list is updated based on VM attributes defined by the user to reclassify endpoints into specific uSeg EPGs
      - You need to specify a BD for the uSeg EPG
      - Flooding will happen THERE (see the uSeg EPG sketch after these replies).
    4. I defy anyone to look at the diagram contained in the ACI Virtualization Guide on this topic and say this is a supportable or scalable solution...

      http://www.cisco.com/c/dam/en/us/td/i/500001-600000/500001-510000/500001-501000/500655.jpg
    5. Hey Anonymous :-)
      Apparently you didn't understand the principle. This is one (primary, secondary) pair per bridge domain (BD), not per ACI VLAN encap. You can have many EPGs per BD... so scale is really not the issue.
      But as you've chosen to stay anonymous, I guess the discussion doesn't need to go any further anyway.
    6. There is always a workaround or a "right architecture" when you deploy these solutions, especially in enterprise environments rather than in a lab. Forcing traffic via PVLAN (I can't believe we're still using this technology), proxy ARP, inconsistent state between what you see in vSphere vs. ACI, not to mention suboptimal traffic forwarding and inspection. It doesn't make sense to me to do all that just because you need to have a tick mark.
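    To make the uSeg mechanics discussed above a bit more concrete, here is a rough sketch of the corresponding APIC REST objects, posted with the Python requests library: an attribute-based (uSeg) EPG under an existing bridge domain that classifies endpoints by VM name and is bound to a VMware VMM domain. The APIC address, credentials, tenant, BD and domain names are made up, and the exact class/attribute set may differ between ACI releases; treat this as an illustration of the object model rather than a recipe.

      import requests

      APIC = 'https://apic.example.com'
      s = requests.Session()
      s.verify = False

      # Authenticate; the session keeps the APIC-cookie for subsequent calls
      s.post(f'{APIC}/api/aaaLogin.json',
             json={'aaaUser': {'attributes': {'name': 'admin', 'pwd': 'secret'}}})

      # One uSeg EPG under an existing bridge domain, classifying endpoints whose
      # VM name contains "web", and bound to the VMware VMM domain
      useg_epg = {
          'fvAEPg': {
              'attributes': {'name': 'useg-web', 'isAttrBasedEPg': 'yes'},
              'children': [
                  {'fvRsBd': {'attributes': {'tnFvBDName': 'bd-web'}}},
                  {'fvCrtrn': {'attributes': {'name': 'default'}, 'children': [
                      {'fvVmAttr': {'attributes': {'name': 'vm-name-match',
                                                   'type': 'vm-name',
                                                   'operator': 'contains',
                                                   'value': 'web'}}}]}},
                  {'fvRsDomAtt': {'attributes': {'tDn': 'uni/vmmp-VMware/dom-vds-dom'}}}]}}

      r = s.post(f'{APIC}/api/node/mo/uni/tn-demo/ap-app1/epg-useg-web.json', json=useg_epg)
      r.raise_for_status()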
  8. I do work for Cisco in the ACI team. However, I used to work at Oracle. This whole topic is a massive deja vu.

    ".. it seems VMware might throw another spanner in the works."
    Funny how VMware played the open/choice card vis-a-vis Oracle software running in a VMware hypervisor environment, but is now making Oracle's arguments against Cisco ;-). Hypocrisy much?
    Let me refresh your mind:

    Oracle support policy>>
    https://mikedietrichde.com/2011/01/17/is-oracle-certified-to-run-on-vmware/
    And do you know what VMware says about Oracle, which takes a similar approach when someone runs Oracle
    software/databases in a VMware ESXi environment?
    >>https://www.vmware.com/support/policies/oracle-support.html

    So, coming back to AVS/N1KV being turned off...

    It would be a really interesting conversation that this vendor would have with customers who have deployed AVS and N1KV if they decided to turn APIs off.

    Ivan (@ioshints), it would be nice if you spent some time with someone who actually breathes ACI and/or validated it in detail yourself.
    You make some comments here that seem to be assumptions, like "would degrade Cisco ACI", that would be nice to validate.

    Further, Ivan (@ioshints), you make it sound like closing down APIs is a good thing or an inevitable fact of life.
    Replies
    1. I agree the whole thing is a massive deja vu all over again. Unfortunately that's how the industry we work in works. See also RFC 1925.

      As for "It would be a really interesting conversation that this vendor would have with customers who have deployed AVS and N1KV if they decided to turn APIs off." - they did that with dvFilter API which was used by way more products. There must have been heated conversations. I haven't heard about them from any of my customers.

      "would be nice if you spent some time with someone who actually breathes ACI" << I did ;)

      "you make it sound like closing down APIs is a good thing" << where did I do that?

      "or an inevitable fact of life." << That's probably true. BTW, what happened to OnePK API?
    2. This comment has been removed by the author.
  9. As has been mentioned here, ACI has too many moving parts. I recall reading "The Policy Driven Data Center with ACI: Architecture, Concepts, and Methodology" a few years back and just going through that text back then I came to the same conclusion.
    Replies
    1. Have to disagree with that: Having ACI as your fabric effectively gives you ONE moving piece, which in turn enables you to consolidate the hundreds of moving pieces we've been dealing with for decades.
  10. Ivan, I've been a reader of your blogs for many years and appreciate your content.

    I was one of the founders of ACI, and it is very disappointing to see how it is represented here. I'll stay away from the NSX comparison but want to highlight the value proposition of ACI.

    In the data center you have VMs and bare metal (storage and servers), and you are evolving toward containers in a VM, or containers natively on bare metal.

    Now think about managing networking/vSwitches in vSphere, Hyper-V and KVM for your VMs - maybe you have one type, or you are considering deploying another vendor - plus physical switches for bare metal, container networking, and then a VXLAN-VLAN gateway for BM<->VM connectivity, etc.

    It's not only the management complexity, but also operations, troubleshooting and policy consistency. And then making sure it all works at scale and, hopefully, doesn't change when a new technology like containers is to be deployed.

    Please read one of my blogs:

    http://blogs.cisco.com/datacenter/aci-the-sdn-purpose-built-for-data-center-operations-and-automation

    In the end, customers validate the technology, and I'm very proud to hear from customers that ACI is a game changer.

    I'll close with one paragraph from my blog:

    "ACI was architected to provide a unified network independent of the type of workload – bare metal, virtual machine, or container. At the end of the day, workloads are IP endpoints and two IP end points want to connect. They need load balancing, security, and other services. Their life cycle may be static, may need migration, or simply a quick start/stop. ACI handles it all."
    Replies
    1. Right on. Ronak
    2. Praveen,

      Thanks for the comment. Could we please focus on what I wrote and not what Cisco's engineers read between the lines, or what marketing messages they might like to propagate?

      Please note that there are no generic comments on ACI applicability, or on how your customers might perceive it, in this blog post; it deals with the potential impact of not having AVS in vSphere anymore.

      Whoever claims that you can get the exact same functionality in a vSphere environment without AVS is mistaken, as I'll explain in a follow-up blog post.

      Finally, you might want to watch the Networking Field Day videos (including the latest ones from CLEUR) to get a wider perspective on my ACI views.

      Kind regards,
      Ivan
    3. Ivan,

      First, AVS is an optional component of ACI. Let me give an example of using vDS and implementing micro-segmentation.

      And before I do that, let me ask you: why is micro-segmentation only for VMware VMs? Why don't bare metal, containers and other hypervisors deserve that treatment? That's exactly what customers get when they deploy ACI: a consistent policy and consistent micro-segmentation across all workloads.

      EPGs are the micro-segments in ACI. Define any number of them and put them under the same BD (a segment). Put contracts/policies across them. When using VMware, each EPG will be pushed as a separate port group and policy is enforced per EPG. At the same time, the flood semantics of the BD are maintained across them. The same thing works with bare metal, containers, Hyper-V etc. (see the sketch at the end of this reply).

      Here is my blog on micro-seg:
      http://blogs.cisco.com/datacenter/microsegmentation

      There are other ways to achieve that in ACI using private vlans but let's leave it for some other time.

      If you say "That would degrade Cisco ACI used in vSphere environments into a smarter L2+L3 data center fabric." Citing this without deep dive is what I'm not happy about.

      I'm not going to compare feature by feature here; please do a hands on or talk to customers who deployed this.

      BTW, I'm no longer in Cisco and now I only get news like everybody else does. Here is what I saw yesterday

      https://www.linkedin.com/feed/update/urn:li:activity:6250791574444347392/

      Cheers
      Praveen
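    For completeness, here is a rough sketch of the configuration described in this reply, again using the APIC REST API via the Python requests library: one bridge domain, two EPGs under it with a contract between them, and a VMM-domain binding so that each EPG is pushed to vCenter as a port group. All names (APIC address, tenant, BD, EPGs, contract, VMM domain) are hypothetical and the class/attribute details may vary by ACI release; it is meant to illustrate the object model being described, not to be copied as-is.

      import requests

      APIC = 'https://apic.example.com'
      s = requests.Session()
      s.verify = False

      # Authenticate; the session keeps the APIC-cookie for subsequent calls
      s.post(f'{APIC}/api/aaaLogin.json',
             json={'aaaUser': {'attributes': {'name': 'admin', 'pwd': 'secret'}}})

      def epg(name, contract_rel):
          # One EPG bound to the shared BD and to the VMware VMM domain
          return {'fvAEPg': {'attributes': {'name': name}, 'children': [
              {'fvRsBd': {'attributes': {'tnFvBDName': 'bd-app'}}},
              {'fvRsDomAtt': {'attributes': {'tDn': 'uni/vmmp-VMware/dom-vds-dom'}}},
              contract_rel]}}

      tenant = {'fvTenant': {'attributes': {'name': 'demo'}, 'children': [
          {'fvBD': {'attributes': {'name': 'bd-app'}}},
          {'vzBrCP': {'attributes': {'name': 'web-to-db'}, 'children': [
              {'vzSubj': {'attributes': {'name': 'any'}, 'children': [
                  {'vzRsSubjFiltAtt': {'attributes': {'tnVzFilterName': 'default'}}}]}}]}},
          {'fvAp': {'attributes': {'name': 'app1'}, 'children': [
              epg('web', {'fvRsCons': {'attributes': {'tnVzBrCPName': 'web-to-db'}}}),
              epg('db', {'fvRsProv': {'attributes': {'tnVzBrCPName': 'web-to-db'}}})]}}]}}

      r = s.post(f'{APIC}/api/node/mo/uni/tn-demo.json', json=tenant)
      r.raise_for_status()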
    4. Direct link to the DataVita data center built on ACI:

      http://blogs.cisco.com/datacenter/never-better-time-for-cisco-aci-enabled-data-center-providers
    5. Ivan,

      you can be sure that VMware closing its APIs will not stop us from proposing an alternative architecture to the AVS.
      It's clearly not the "death" of AVS, and customers currently running AVS can be assured that we'll do everything required to maintain their commitment to ACI and add AVS "services" to their environment post vSphere 6.5U1.
  11. Sadly, when a big vendor makes these changes of direction, the only one who loses is the small customer... it was the same when Cisco said to throw out all the Nexus 7K/5K and buy Nexus 9K if you want automation... it's just that the customer who invested the money has to deal with these changes and the loss :(
  12. "Update 2017-03-21: As is often the case I stand corrected (thanks to g9ais and Nillo) - you can use PVLAN to pull all traffic out of the hypervisor to the ToR switch, and process it there. I still think there are things that can go wrong with that approach with just the right mix of flooded traffic, so I don't think it's functionally equivalent to having full control of the virtual switch. Can't figure out at the moment whether that's relevant or not - any comment would be highly welcome."

    This is definitely not functionally equivalent to having full control of the virtual switch. Losing host-based local switching and routing is one example. But in this case, as Nillo mentioned, traffic optimization is not the big issue from a practical standpoint. A bigger disadvantage in real-life scenarios is the lack of service insertion into the L2 path or the intra-subnet L3 path. In theory ACI can use uSeg EPGs and use contracts between them, but there is still no service insertion into the L2 path. What’s more, uSeg EPGs with so many contracts do not scale even on the next-gen 93180YC-EX. In comparison, NSX does this natively with many security vendors. Without effective micro-segmentation, ACI cannot offer SpoofGuard-like protection against man-in-the-middle attacks or IP address spoofing. The best switch on the market will not see flows that are switched locally on the host. How can security be fully provided if there is no full visibility of the applications running on the hosts? So I would say security is the biggest concern with a solution that is hardware-switch based rather than vSwitch based.

    To anticipate some arguments:
    Yes, NSX-v is for vSphere and NSX-T is for multihypervisor.
    Yes, I also recommend hardware based solutions where critical apps are on the bare metal.
    Yes, Visibility of the underlay is not an issue.
    Yes, Virtual firewalls play a vital role to protect apps in the virtual environment.
    Yes, Both NSX and ACI have their own pros and cons. :)
  13. Most of our unfortunate coupling of network technologies has been in pursuit of performance. E.g., VMware will talk your ear off about the value of NSX local switching/routing/FW (though they'll go curiously quiet when working on NSX edge designs).

    But it seems that with VMXNET3 driver support, DirectPath I/O, SR-IOV, and VXLAN+TCP offload support, we've got enough network performance that maybe it's time to shift all the vSwitch bloat into a VM.

    vArmour gets by just fine with completely VM-based switching -- they use one VM on each host for traffic interception and redirection. Some traffic is redirected w/VXLAN to the main firewall VMs. Other traffic is forwarded directly to the uplinks or other VMs on the host. Their approach is totally agnostic to the hypervisor vSwitch, so it works with every hypervisor and cloud environment. It doesn't have the raw speed of heavily-coupled hardware/hypervisor forwarding, but it's close enough that the only ones who will raise a stink about it are the sales engineers.

    Cisco did the same thing with Virtual Topology System (VTS) -- it began as a totally decoupled software VXLAN engine with distributed L2 control plane -- perfect for extending L2 between different hypervisors located anywhere in the world. They've since spun VTS into something slightly different, but maybe Cisco should bring back and reemphasize its hypervisor-agnostic switching capabilities.
    Replies
    1. Hi Craig. I work for vArmour and wanted to correct you on one point. You're right about vArmour for the most part, but I think you're working with somewhat outdated information. There is no forwarding or offloading to separate FW VMs. This was the behavior of the system in a previous generation. The current VXLAN implementation is for control plane only. Everything inside the hypervisor gets L7 inspected and forwarded on the normal switching path. This is true regardless of whether it's deployed in ACI or non-ACI mode.

      And you're right about the performance. With some of Intel's latest libraries, you can make userspace VMs run pretty fast from an I/O perspective. 8+ Gbps performance per vCPU with application ID and verbose logging with 1500 byte MTU.
  14. One of my favourite things about Ivan's posts is the discussion that spontaneously develops in the comments; call it the right topic or hitting a nerve...
    I'm a true ACI and NSX fan, also a VCIX-NV, but I really don't like that VMware closed the APIs for 3rd-party integration; it's a bit cowardly to face the competitor's technology that way. A real potential of SDN is in how you handle the hypervisor virtual switch, and both Cisco and Nuage now need to completely change their strategy.
    My question is: does anyone know if VMware just stopped supporting solutions with a 3rd-party virtual switch, or are the APIs closed altogether? Could we, as a system integrator, support vCenter + AVS?
    Replies
    1. No, the API is closed; VMware will focus its engineers on working on their own products, like the vDS API, etc.
  15. Hi,
    interesting blog and comments.

    As we're 9 months down the line, what is the situation now?

    AVS will not work at all in vSphere 6.5?
    Are Cisco working on a VM-based v-switch to replace AVS?
    Raw speed is not important to me, I'd rather have the functionality of an ACI-integrated vswitch.
    I'm not interested in using VMware's VDS, not in the slightest (I hope VMware are reading this!).
  16. I think I've answered my own question:
    https://supportforums.cisco.com/t5/application-centric/aci-integration-with-vm/td-p/3030792