Gartner has updated their networking hype cycle. Not surprisingly:
- Ethernet switching fabrics are on the slope of enlightenment (finally – we’ve been educating networking engineers on what they really are for half a decade);
- SDN is well on its way into the trough of disillusionment (shameless plug: I guess not enough people attended real-life SDN workshops) and whitebox switching is going the same way;
- SD-WAN is nearing the peak of inflated expectations;
- FCoE and Long-Distance vMotion will be dead before they reach the plateau.
A while ago Cisco added dynamic FCoE support to Nexus 5000 switches. It sounded interesting and I wanted to talk about it in my Data Center Fabrics update session, but I couldn’t find any documentation at that time.
In the meantime, the Configuring Dynamic FCoE Using FabricPath configuration guide appeared on Cisco’s web site and J Metz wrote a lengthly blog post explaining how it all works, triggering a severe attack of déjà vu.
One of my readers wanted to deploy FCoE on UCS in combination with Nexus 1000v and wondered how the FCoE traffic impacts QoS on Nexus 1000v. He wrote:
Let's say I want 4Gb for FCoE. Should I add bandwidth shares up to 60% in the nexus 1000v CBWFQ config so that 40% are in the default-class as 1kv is not aware of FCoE traffic? Or add up to 100% with the assumption that the 1kv knows there is only 6Gb left for network? Also, will the Nexus 1000v be able to detect contention on the uplink even if it doesn't see the FCoE traffic?
As always, things aren’t as simple as they look.
One of my regular readers sent me a long list of FCoE-related questions:
I wanted to get your thoughts around another topic – iSCSI vs. FCoE? Are there merits and business cases to moving to FCoE? Does FCoE deliver better performance in the end? Does FCoE make things easier or more complex?
He also made a very relevant remark: “Vendors that can support FCoE promote this over iSCSI; those that don’t have an FCoE solution say they aren’t seeing any growth in this area to warrant developing a solution”.
Chris Marget recently asked a really interesting question:
I've encountered an environment where the iSCSI networks are built just like FC networks: Multipathing software in use on servers and storage, switches dedicated to "SAN A" and "SAN B" VLANs, and full isolation of paths (redundant paths) between server and storage. I understand creating a dedicated iSCSI VLAN, but why would you need two? Isn’t the whole thing running on top of TCP? Am I missing something?
Well, it actually makes sense in some mission-critical environments.
Update 2015-12-06: Ethernet checksums are not a workaround for lack of iSCSI-level checksums. If your iSCSI solution doesn't support application-level checksum, your data might be at risk
Matt Thompson provided a really good answer to the “what’s acceptable oversubscription ratio in a ToR switch” when he wrote “I’m expecting a ‘how long is a piece of string’ answer” (note: do watch the BBC video answering that one).
There’s the 3:1 rule-of-thumb recipe, with a more realistic answer being “it depends”. Now let’s see if we can go beyond that without a deep dive into scholastic waters.
Just prior to Networking Field Day, the merry band of geeks sat down with Chip Copper, Brocade’s Solutioneer (a job title almost as good as Packet Herder) to discuss the intricate details of VCS Fabric. The videos are well worth watching – the technical details are interesting, but above all, Chip is a fantastic storyteller.
Should you use FC, FCoE or iSCSI when deploying new gear in your existing data center? What about Greenfield deployments? What are the decision criteria? Should you just skip iSCSI and use NFS if you’re focusing on server virtualization with VMware? Does it still make sense to build separate iSCSI network? Are jumbo frames useful? We’ll try to answer all these questions and a few more in the first Data Center Virtual Symposium sponsored by Cisco Systems.
Anyone serious about high-availability connects servers to the network with more than one uplink, more so when using converged network adapters (CNA) with FCoE. Losing all server connectivity after a single link failure simply doesn’t make sense.
If at all possible, you should use dynamic link aggregation with LACP to bundle the parallel server-to-switch links into a single aggregated link (also called bonded interface in Linux). In theory, it should be simple to combine FCoE with LAG – after all, FCoE runs on top of lossless Ethernet MAC service. In practice, there’s a huge difference between theory and practice.
The last week has been “interesting” – I created the draft slide deck for the Data Center Fabric Architectures webinar on November 16th (register here) and sent the relevant slides to all vendors mentioned in the presentation to give them a chance to fix my errors – every vendor got at least the scorecard describing my understanding of their solution.
As you might know, I did a Data Center Fabric Architectures presentation at the EuroNOG 2011 conference last week. The slides are available online; for a more in-depth version, please register for my Data Center Fabrics webinar.
As always, if I got anything wrong, please write a comment.
The following design challenge landed in my Inbox not too long ago:
My organization is the in the process of building a completely new data center from the ground up (new hardware, software, protocols ...). We will currently start with one site but may move to two for DR purposes. What DC technologies should we be looking at implementing to build a stable infrastructure that will scale and support technologies you feel will play a big role in the future?
In an ideal world, my answer would begin with “Start with the applications.”
J Michel Metz brought out an interesting aspect of the dense/sparse mode FCoE design dilemma in a comment to my FCoE over Trill ... this time from Juniper post: FC-focused troubleshooting. I have to mention that he happens to be working for a company that has the only dense-mode FCoE solution, but the comment does stand on its own.
Before reading this post you might want to read the definition of dense- and sparse-mode FCoE and a few more technical details.
A tweet from J Michel Metz has alerted me to a “Why TRILL won't work for data center network architecture” article by Anjan Venkatramani, Juniper’s VP of Product Management. Most of the long article could be condensed in two short sentences my readers are very familiar about: Bridging does not scale and TRILL does not solve the traffic trombone issues (hidden implication: QFabric will solve all your problems)... but the author couldn’t resist throwing “FCoE over TRILL” bone into the mix.
Intel announced its Open FCoE (software implementation of FCoE stack on top of Intel’s 10GB Ethernet adapters) using the cloudy bullshit bingo including simplifying the Data Center, Free New Technology, Cloud Vision and Green Computing (ok, they used Environmental impact) and lots of positive supporting quotes. The only thing missing was an enthusiastic Gartner quote (or maybe they were too expensive?).