Running vSphere on Cisco ACI? Think Twice…
When Cisco ACI was launched, it promised to do everything you need (plus much more, and in a multi-hypervisor environment). It quickly became obvious that you can't do all that on ToR switches alone - you need control of the virtual switch (the real network edge) to get the job done.
- Update 2017-03-17: It was not clear in the original post that the API I was referring to is the third-party vSwitch data plane API.
- Update 2017-03-26: A week after this post was published, VMware announced exactly what I described here. So much for the "they will never dare to do it" wishful thinking from $NetVendor engineers.
No problem, there’s Cisco Application Virtual Switch. Well, its deployment on vSphere was not supported by VMware (in case you care), and it seems VMware might throw another spanner in the works.
It looks like even the Nexus 1000V is currently not working on vSphere 6.5, and based on some VMware training material I've seen, third-party distributed switches might not be supported at all in vSphere 6.5U1. Whether that means "don't call us" or "they won't work because the API will be gone" wasn't exactly clear (so we'll have to wait till 6.5U1 comes out). However, with vDS being pretty close to Nexus 1000V feature-wise (and NSX competing with ACI), I would expect VMware to kill the internal vDS data/control plane API used by third-party virtual switches the same way they killed the dvFilter API.
That would degrade Cisco ACI used in vSphere environments into a smarter L2+L3 data center fabric. Is that worth the additional complexity you get with ACI? It depends… and we'll discuss some aspects of that in the VMware NSX, Cisco ACI or standards-based EVPN workshop in mid-May.
Note to potential commenters: you cannot implement EPGs and all the other fun stuff at a granularity finer than VLANs unless you control the virtual switch. The proof is left as an exercise for the reader.
Update 2017-03-21: As is often the case I stand corrected (thanks to g9ais and Nillo) - you can use PVLAN to pull all traffic out of the hypervisor to the ToR switch, and process it there. I still think there are things that can go wrong with that approach with just the right mix of flooded traffic, so I don't think it's functionally equivalent to having full control of the virtual switch. Can't figure out at the moment whether that's relevant or not - any comment would be highly welcome.
While it's easy to make fun of how limited the VMware virtual switch is, its performance was always stellar (at least compared to the alternatives).
Not to mention that this becomes a nightmare to manage, more so as you need a different VLAN on every inter-VM connection (or traffic gets bridged before it hits _your_ virtual switch).
"Cisco, on the other hand, can ditch the VS altogether and extended these p2p vlans to the hardware switches" << Don't forget that you can't turn off vSwitch in ESXi or reconfigure it at will like you can in Linux.
* One port group (or more) per EPG
* Separate VLAN for every EPG
* Private VLANs if you don't want VMs within EPG to communicate
* SDN controller creating vSphere port groups and attaching VMs to port groups behind the scenes (see the sketch below).
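To give a feel for what "behind the scenes" means, here is a minimal sketch of that kind of automation using pyVmomi. It assumes a hypothetical vDS named DVS-ACI on which the PVLAN map (primary/secondary pairs) has already been configured; the vCenter name, credentials and the EPG-to-PVLAN mapping are all invented for illustration.

```python
# Hypothetical sketch: one vDS port group per EPG, each pinned to its own
# secondary PVLAN. Host name, credentials, switch name and the EPG-to-PVLAN
# mapping are all made up.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

def create_epg_portgroup(dvs, epg_name, secondary_pvlan_id):
    """Create an early-binding port group whose default port config puts
    all ports into the given secondary (isolated/community) PVLAN."""
    pvlan = vim.dvs.VmwareDistributedVirtualSwitch.PvlanSpec(
        pvlanId=secondary_pvlan_id)
    port_cfg = vim.dvs.VmwareDistributedVirtualSwitch.VmwarePortConfigPolicy(
        vlan=pvlan)
    spec = vim.dvs.DistributedVirtualPortgroup.ConfigSpec(
        name='EPG_' + epg_name, type='earlyBinding', numPorts=128,
        defaultPortConfig=port_cfg)
    return dvs.AddDVPortgroup_Task([spec])    # takes a list of specs

ctx = ssl._create_unverified_context()        # lab only, skip cert checks
si = SmartConnect(host='vcenter.example.com',
                  user='administrator@vsphere.local',
                  pwd='secret', sslContext=ctx)
content = si.RetrieveContent()
view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.DistributedVirtualSwitch], True)
dvs = next(d for d in view.view if d.name == 'DVS-ACI')

# One port group (and one secondary PVLAN) per EPG -- mapping is invented
for epg, sec_vlan in [('web', 101), ('app', 102), ('db', 103)]:
    create_epg_portgroup(dvs, epg, sec_vlan)

Disconnect(si)
```

Attaching VM vNICs to those port groups would then be another batch of ReconfigVM_Task calls, which is exactly the kind of churn hinted at above.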
I'm sure it will work and scale amazingly well, and everyone will be delighted (including vSphere admins), and I'm only imagining problems where there are none...
Also, to be honest, VMM integration is great, but far from mandatory. You can achieve exactly the same level of integration by putting some automation in place and using physical domains... no functional difference in the end. So who cares, except VMware fanboys who are anyway too emotional to limit DC design to real requirements?
Seriously, why are Cisco staff so petty?
They would also screw other partners such as Big Switch who rely on the same APIs.
If VMware wants to do such a thing and push the world even closer to open-source hypervisors and containers, so be it.
http://static-void.io/vmware-nsx-use-case-simplifying-disaster-recovery-part-1/
Before joining Cisco, you were a VMware / NSX fanboy as well ;-)
He wrote a blog post on how to use NSX for a particular use case. Is that bad? At least he can say he understands both products.
Also, based on the fact that you chose to remain anonymous, my end of this discussion stops right here.
It is true that VMware is becoming a company whose technology is increasingly vertically integrated; eventually they want that if you use their hypervisor, you also have to use their SDN solution, their orchestration system, their VDI, and ... I think the market will steer things otherwise … but we shall see! 😃
I don't know how much of an opportunity you've had for hands-on time with ACI recently. I'd love to spend time showing you how some of the things we do with ACI work. In the meantime, here is my respectful feedback.
I definitely disagree with some comments that you made. Some are in fact open for debate, for instance:
“You can’t do all that on ToR switches, and need control of the virtual switch”.
I would say you can do a lot of what you need for server networking on a modern ToR. And yet you are right: you do need a virtual switch. That is clear. How much functionality you put on one vs. the other is a subject for debate, with pros and cons.
But in an SDN world with programmable APIs, it does not mean you need YOUR virtual switch in order to control it. You just need one virtual switch that you can program. That is all.
There's a lot that we can do on AVS that we can also do on OVS. And we do it on OVS too. There's a lot we do on AVS that we can't do with VDS. But there's enough that we can do in VDS so that, when combining it with what we do on the ToR, we deliver clever things (read more below).
The comment below, on the other hand, is imho misinformed:
“That [running without AVS] would degrade Cisco ACI used in vSphere environments into a smarter L2+L3 data center fabric. Is that worth the additional complexity you get with ACI? "
First, it is wrong to assume that in a vSphere environment ACI is no more than smarter L2+L3 data center fabric. But even if it was only that … it is a WAY smarter L2+L3 fabric.
And that leads me to your second phrase. What is up with the "additional complexity you get with ACI"?
This bugs me greatly. ACI has a learning curve, no doubt. But we need to understand that the line between “complex” and “different” is crossed by eliminating ignorance.
Anyone who has done an upgrade on any more than a handful of switches from any vendor, and then conducts a network upgrade (or downgrade) of dozens and dozens of switches under APIC control, will see how much simpler it becomes.
Configuring a new network service involving L2 and L3 across dozens and dozens of switches in multiple data centers is incredibly simpler. Presenting those new networks to multiple vCenters? Piece of cake. Finding a VM on the fabric by querying for the VM name? … done. Reverting the previously created network service is incredibly simpler too. I mean … compared to traditional networking … let me highlight: INCREDIBLY simpler.
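For flavor, here is roughly what such a fabric-wide lookup looks like against the APIC REST API - a minimal sketch with a made-up APIC hostname, credentials and IP address, querying the endpoint database by IP (the VM-name search goes through the VMM inventory instead, so treat this as an illustration of the idea rather than that exact feature).

```python
# Hypothetical sketch: ask the whole fabric "where is this endpoint?"
# with a single APIC class query. Hostname, credentials and IP are invented.
import requests

APIC = 'https://apic.example.com'
session = requests.Session()
session.verify = False                        # lab only

# Authenticate; APIC stores the token in a session cookie
session.post(APIC + '/api/aaaLogin.json',
             json={'aaaUser': {'attributes': {'name': 'admin', 'pwd': 'secret'}}})

# fvCEp is the fabric-wide endpoint class (MAC/IP, EPG, encap)
resp = session.get(APIC + '/api/class/fvCEp.json',
                   params={'query-target-filter': 'eq(fvCEp.ip,"10.1.1.10")'})

for obj in resp.json().get('imdata', []):
    attrs = obj['fvCEp']['attributes']
    # The dn encodes the tenant/app profile/EPG where the endpoint was learned
    print(attrs['dn'], attrs['mac'], attrs['encap'])
```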
Changing your routing policies to announce (or not announce) specific subnets from your fabric, or making a change to update storm-control policies on hundreds or thousands of ports … or - again - reverting any of those changes, becomes really simple. Particularly when you think about how you were doing it on NX-OS or with any other vendor's box-by-box configuration system.
And the truth is that APIC accomplishes all of that, and more, with a very elegant architecture based on distributed intelligence in the fabric combined with a centralised policy and management plane on a scale-out controller cluster.
Other vendors require combining six different VMs performing three different functions just to achieve a distributed default gateway that is only available to workloads running on a single-vendor hypervisor. Now that's complex, regardless of how well you know the solution.
Full response with details about how we do uSeg with VDS here:
http://nillosmind.blogspot.com/2017/03/a-response-to-running-vsphere-on-cisco.html
In addition, although this is PVLAN in the VDS, it's not PVLAN on the ToR. This means (see the sketch after this list):
- Normal ACI BD semantics are applied, regardless of whether the EPG is enabled for uSeg or not (understand: PVLAN pushed to the port group). All EPGs inside the same bridge domain receive the same PVLAN IDs (no burning of VLANs). The secondary isolated VLAN in ACI then translates into a reserved pcTag that is set with deny({src,dst}=isolated encap) - no PVLAN implementation there at all...
- MAC-based classification is pushed to the ToR for the base EPG
- The list is updated based on user-defined VM attributes to reclassify endpoints into specific uSeg EPGs
- You need to specify a BD for the uSeg EPG
- Flooding will happen THERE.
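To make the above more concrete, here is a minimal sketch of what a uSeg EPG definition looks like through the APIC REST API. The tenant, application profile, BD and the VM-name match are all invented, and the payload is stripped to the bare bones - an illustration of the object model, not a reference configuration.

```python
# Hypothetical sketch: a uSeg EPG under an existing BD that reclassifies
# endpoints based on a VM-name attribute. All names are made up.
import requests

APIC = 'https://apic.example.com'
session = requests.Session()
session.verify = False                        # lab only
session.post(APIC + '/api/aaaLogin.json',
             json={'aaaUser': {'attributes': {'name': 'admin', 'pwd': 'secret'}}})

useg_epg = {
    'fvAEPg': {
        'attributes': {
            'name': 'web-quarantine',
            'isAttrBasedEPg': 'yes'           # marks this as a uSeg EPG
        },
        'children': [
            # uSeg EPGs still need a BD (flooding happens there)
            {'fvRsBd': {'attributes': {'tnFvBDName': 'BD-Web'}}},
            # VM-based matching criterion: grab every VM whose name contains
            # 'quarantine' and reclassify it into this EPG
            {'fvCrtrn': {'attributes': {'name': 'default'},
                         'children': [
                {'fvVmAttr': {'attributes': {'name': 'by-vm-name',
                                             'type': 'vm-name',
                                             'operator': 'contains',
                                             'value': 'quarantine'}}}]}}
        ]
    }
}

# Post under an existing tenant / application profile (names invented)
resp = session.post(APIC + '/api/mo/uni/tn-Demo/ap-WebApp.json', json=useg_epg)
print(resp.status_code, resp.text)
```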
http://www.cisco.com/c/dam/en/us/td/i/500001-600000/500001-510000/500001-501000/500655.jpg
Apparently you didn't understand the principle. This is one (Primary, Secondary) pair per Bridge Domain (BD), not per ACI VLAN encap. You can have many EPGs per BD... so scale is really not the issue.
But as you've chosen to stay anonymous, I guess the discussion doesn't need to go any further anyway.
".. it seems VMware might throw another spanner in the works."
Funny how VMware played the open/choice card vis-a-vis Oracle software running on the VMware hypervisor, but is now making Oracle's arguments against Cisco ;-). Hypocrisy much?
Let me refresh your mind:
Oracle support policy>>
https://mikedietrichde.com/2011/01/17/is-oracle-certified-to-run-on-vmware/
Did you know that VMware takes a similar approach when someone runs Oracle software/databases in a VMware ESXi environment?
>>https://www.vmware.com/support/policies/oracle-support.html
So, coming back to AVS/N1KV being turned off...
It would be a really interesting conversation that this vendor would have with customers who have deployed AVS and N1KV if they decided to turn APIs off.
Ivan (@ioshints), it would be nice if you spent some time with someone who actually breathes ACI, and/or validated it in detail yourself.
You make some comments here that seem to be assumptions - like "would degrade Cisco ACI" - that it would be nice to validate.
Further, Ivan (@ioshints), you make it sound like closing down APIs is a good thing or an inevitable fact of life.
As for "It would be a really interesting conversation that this vendor would have with customers who have deployed AVS and N1KV if they decided to turn APIs off." - they did that with dvFilter API which was used by way more products. There must have been heated conversations. I haven't heard about them from any of my customers.
"would be nice if you spent some time with someone who actually breathes ACI" << I did ;)
"you make it sound like closing down APIs is a good thing" << where did I do that?
"or an inevitable fact of life." << That's probably true. BTW, what happened to OnePK API?
I was one of the founders of ACI, and it's very disappointing to see how it is represented here. I'll stay away from the NSX comparison, but I want to highlight the value proposition of ACI.
In the data center you have VMs and bare metal (storage and servers), evolving into containers in a VM or containers natively on bare metal.
Now think about managing networking/vSwitches in vSphere, Hyper-V, and KVM for your VMs (maybe you have one type, or you are considering deploying another vendor), plus physical switches for bare metal, container networking, and then a VXLAN-VLAN gateway for BM<->VM connectivity, etc.
It's not only the management complexity, but also operations, troubleshooting, and policy consistency - and then making sure it all works at scale and hopefully doesn't change when a new technology like containers is deployed.
Please read one of my blogs:
http://blogs.cisco.com/datacenter/aci-the-sdn-purpose-built-for-data-center-operations-and-automation
In the end, customers validate the technology, and I'm very proud to hear from customers that ACI is a game changer.
I'll close with one paragraph from my blog:
"ACI was architected to provide a unified network independent of the type of workload – bare metal, virtual machine, or container. At the end of the day, workloads are IP endpoints and two IP end points want to connect. They need load balancing, security, and other services. Their life cycle may be static, may need migration, or simply a quick start/stop. ACI handles it all."
Thanks for the comment. Could we please focus on what I wrote and not what Cisco's engineers read between the lines, or what marketing messages they might like to propagate?
Please note that there are no generic comments in this blog post on ACI applicability or how your customers might perceive it; it deals with the potential impact of not having AVS in vSphere anymore.
Whoever claims that you can get the exact same functionality in a vSphere environment without AVS is mistaken, as I'll explain in a follow-up blog post.
Finally, you might want to watch the Networking Field Day videos (including the latest ones from CLEUR) to get a wider perspective on my ACI views.
Kind regards,
Ivan
First, AVS is an optional component of ACI. Let me give an example of using vDS and implementing micro-segmentation.
And before I do that, let me ask: why should micro-segmentation be only for VMware VMs? Why don't bare metal, containers, and other hypervisors deserve that treatment? That's exactly what customers get when they deploy ACI: consistent policy and consistent micro-segmentation across all workloads.
EPGs are the micro-segments in ACI. Define any number of them and put them under the same BD (a segment), then put contracts/policies across them. When using VMware, each EPG is pushed as a separate port group and policy is enforced per EPG, while the flooding semantics of the BD are maintained across them. The same thing works with bare metal, containers, Hyper-V, etc.
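As a rough sketch of that model (names, subnet, ports and the VMM domain are invented, and the payload is heavily simplified), this is approximately what "two EPGs under the same BD with a contract between them, pushed to vCenter as port groups" looks like as a single APIC REST call:

```python
# Hypothetical sketch: two EPGs under one BD, a contract between them, and a
# VMM-domain attachment so each EPG shows up in vCenter as a port group.
# All names (tenant, VRF, BD, domain, subnet, ports) are invented.
import requests

APIC = 'https://apic.example.com'
session = requests.Session()
session.verify = False                        # lab only
session.post(APIC + '/api/aaaLogin.json',
             json={'aaaUser': {'attributes': {'name': 'admin', 'pwd': 'secret'}}})

VMM_DOM = 'uni/vmmp-VMware/dom-DVS-ACI'       # assumed pre-existing VMM domain

def epg(name, role):
    """One EPG bound to BD-App, attached to the VMM domain, and either
    providing or consuming the web-to-app contract."""
    rel = 'fvRsProv' if role == 'provider' else 'fvRsCons'
    return {'fvAEPg': {'attributes': {'name': name}, 'children': [
        {'fvRsBd': {'attributes': {'tnFvBDName': 'BD-App'}}},
        {'fvRsDomAtt': {'attributes': {'tDn': VMM_DOM}}},
        {rel: {'attributes': {'tnVzBrCPName': 'web-to-app'}}}]}}

tenant = {'fvTenant': {'attributes': {'name': 'Demo'}, 'children': [
    {'fvCtx': {'attributes': {'name': 'VRF1'}}},
    # One bridge domain shared by both EPGs (a single flooding domain)
    {'fvBD': {'attributes': {'name': 'BD-App'}, 'children': [
        {'fvRsCtx': {'attributes': {'tnFvCtxName': 'VRF1'}}},
        {'fvSubnet': {'attributes': {'ip': '10.1.1.1/24'}}}]}},
    # Filter + contract: permit TCP/8080 between the two EPGs
    {'vzFilter': {'attributes': {'name': 'tcp-8080'}, 'children': [
        {'vzEntry': {'attributes': {'name': 'e1', 'etherT': 'ip', 'prot': 'tcp',
                                    'dFromPort': '8080', 'dToPort': '8080'}}}]}},
    {'vzBrCP': {'attributes': {'name': 'web-to-app'}, 'children': [
        {'vzSubj': {'attributes': {'name': 's1'}, 'children': [
            {'vzRsSubjFiltAtt': {'attributes': {'tnVzFilterName': 'tcp-8080'}}}]}}]}},
    # Application profile containing the two micro-segments
    {'fvAp': {'attributes': {'name': 'WebApp'}, 'children': [
        epg('web', 'consumer'), epg('app', 'provider')]}}]}}

resp = session.post(APIC + '/api/mo/uni.json', json=tenant)
print(resp.status_code, resp.text)
```

Deleting the tenant object reverts everything in one call, which is the "reverting is incredibly simpler" point made earlier in the thread.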
Here is my blog on micro-seg:
http://blogs.cisco.com/datacenter/microsegmentation
There are other ways to achieve that in ACI using private VLANs, but let's leave that for some other time.
If you say "That would degrade Cisco ACI used in vSphere environments into a smarter L2+L3 data center fabric." Citing this without deep dive is what I'm not happy about.
I'm not going to compare feature by feature here; please get hands-on or talk to customers who have deployed this.
BTW, I'm no longer at Cisco, and now I only get news like everybody else does. Here is what I saw yesterday:
https://www.linkedin.com/feed/update/urn:li:activity:6250791574444347392/
Cheers
Praveen
http://blogs.cisco.com/datacenter/never-better-time-for-cisco-aci-enabled-data-center-providers
You can be sure that VMware closing the APIs won't stop us from proposing an alternative architecture to the AVS.
It's clearly not the "death" of AVS, and customers currently running AVS can be assured that we'll do everything required to maintain their commitment to ACI and add AVS "services" to their environment post-vSphere 6.5U1.
The functionality is definitely not equivalent to having full control of the virtual switch. Losing host-based local switching and routing is one example, but in this case, as Nillo mentioned, traffic optimization is not the big issue from a practical standpoint.

The bigger real-life disadvantage is the lack of service insertion into the L2 path or the intra-subnet L3 path. In theory ACI can use uSeg EPGs and contracts between them, but there is still no service insertion into the L2 path. What's more, uSeg EPGs with that many contracts don't scale even on the next-gen 93180YC-EX. In comparison, NSX does this natively with many security vendors.

Without effective micro-segmentation, ACI cannot offer SpoofGuard-like protection against man-in-the-middle attacks or IP address spoofing. The best switch on the market will not see flows that are switched locally on the host, and how can security be fully provided if there is no full visibility into the applications running on the hosts? So I would say security is the biggest concern with a solution that is hardware-switch based rather than vSwitch based.
To anticipate some arguments:
Yes, NSX-v is for vSphere and NSX-T is for multi-hypervisor environments.
Yes, I also recommend hardware-based solutions where critical apps are on bare metal.
Yes, visibility of the underlay is not an issue.
Yes, virtual firewalls play a vital role in protecting apps in the virtual environment.
Yes, both NSX and ACI have their own pros and cons. :)
But it seems that with the VMXNET3 driver, DirectPath I/O, SR-IOV, and VXLAN+TCP offload support, we've got enough network performance that maybe it's time to shift all the vSwitch bloat into a VM.
vArmour gets by just fine with completely VM-based switching - they use one VM on each host for traffic interception and redirection. Some traffic is redirected with VXLAN to the main firewall VMs; other traffic is forwarded directly to the uplinks or to other VMs on the host. Their approach is totally agnostic to the hypervisor vSwitch, so it works with every hypervisor and cloud environment. It doesn't have the raw speed of tightly coupled hardware/hypervisor forwarding, but it's close enough that the only ones who will raise a stink about it are the sales engineers.
Cisco did the same thing with Virtual Topology System (VTS) -- it began as a totally decoupled software VXLAN engine with distributed L2 control plane -- perfect for extending L2 between different hypervisors located anywhere in the world. They've since spun VTS into something slightly different, but maybe Cisco should bring back and reemphasize its hypervisor-agnostic switching capabilities.
And you're right about the performance. With some of Intel's latest libraries, you can make userspace VMs run pretty fast from an I/O perspective: 8+ Gbps per vCPU with application ID and verbose logging at a 1500-byte MTU.
I'm a true ACI and NSX fan (also VCIX-NV), but I really don't like that VMware closed the APIs for third-party integration; it's a bit cowardly to face the competitors' technology that way. The real potential of SDN is in how you handle the hypervisor virtual switch, and both Cisco and Nuage need to completely change their strategy.
My question is: does anyone know whether VMware just stopped supporting solutions with a third-party virtual switch, or are the APIs closed altogether? Could we, as a system integrator, support vCenter+AVS?
Interesting blog and comments.
As we're 9 months down the line, what is the situation now?
Will AVS not work at all in vSphere 6.5?
Is Cisco working on a VM-based vSwitch to replace AVS?
Raw speed is not important to me, I'd rather have the functionality of an ACI-integrated vswitch.
I'm not interested in using VMware's VDS, not in the slightest (I hope VMware are reading this!).
https://supportforums.cisco.com/t5/application-centric/aci-integration-with-vm/td-p/3030792