The following design challenge landed in my Inbox not too long ago:
My organization is in the process of building a completely new data center from the ground up (new hardware, software, protocols ...). We will currently start with one site but may move to two for DR purposes. What DC technologies should we be looking at implementing to build a stable infrastructure that will scale and support technologies you feel will play a big role in the future?
In an ideal world, my answer would begin with “Start with the applications.”
A comment left on my dense-mode FCoE post is a perfect example of the dangers of using vague, marketing-driven and ill-defined words like “switching”. The author wrote: “FC-SW is by no means routing ... Fibre Channel is switching.” As I explained in one of my previous posts, switching can mean anything, from circuit-based activities to bridging, routing and even load balancing (I am positive some vendors claim their load balancers ... oops, application delivery controllers ... are L4-L7 switches), so let’s see whether Fibre Channel “switching” is closer to bridging or routing.
J Michel Metz brought out an interesting aspect of the dense/sparse mode FCoE design dilemma in a comment to my FCoE over Trill ... this time from Juniper post: FC-focused troubleshooting. I have to mention that he happens to be working for a company that has the only dense-mode FCoE solution, but the comment does stand on its own.
Before reading this post you might want to read the definition of dense- and sparse-mode FCoE and a few more technical details.
A tweet from J Michel Metz alerted me to a “Why TRILL won't work for data center network architecture” article by Anjan Venkatramani, Juniper’s VP of Product Management. Most of the long article could be condensed into two short sentences my readers are very familiar with: Bridging does not scale and TRILL does not solve the traffic trombone issues (hidden implication: QFabric will solve all your problems) ... but the author couldn’t resist throwing the “FCoE over TRILL” bone into the mix.
A while ago someone asked me whether I think FC-over-MPLS would be a good PhD thesis. My response: while it’s always a good move to combine two totally unrelated fields in your PhD thesis (that almost guarantees you will be able to generate several unique and thus publishable articles), FCoMPLS might be tough because you’d have to make MPLS lossless. However, where there’s a will, there’s a way ... straight from the haze of the “Just because you can doesn’t mean you should” cloud comes FC-BB_PW defined in FC-BB-5 and several IETF drafts.
My first brief encounter with FCoMPLS was a twitxchange with Miroslaw Burnejko who responded to my “must be another lame joke” tweet with a link to a NANOG presentation briefly mentioning it and an RFC draft describing the FCoMPLS flow control details. If you know me, you have probably realized by now that I simply had to dig deeper.
During one of the iSCSI/FC/FCoE tweetstorms @stu made an interesting claim: FC scales to thousands of nodes; iSCSI can’t do that.
You know I’m no storage expert, but I fail to see how FC would be inherently (architecturally) better than iSCSI. I would understand someone claiming that existing host or storage iSCSI adapters behave worse than FC/FCoE adapters, but I can’t grasp why a properly implemented iSCSI network could not scale.
Am I missing something? Please help me figure this one out. Thank you!
A few days after writing my ATAoE post I got a very nice e-mail from Sam Hopkins from Coraid responding to every single point I raised in my post. I have to admit I missed the tag field in the ATAoE packets, which does allow parallel requests between a server and a storage array, solving some of the sequencing/fragmentation issues. I’m still not convinced, but here is the whole e-mail (I only did some slight formatting) with no further comments from my side.
When I started writing about the storage industry and its attempts to tweak Ethernet to its needs, someone mentioned ATAoE. I read the ATAoE Wikipedia article and concluded that this dinky technology probably makes sense in a small home office ... and then I stumbled across an article in The Register that claimed you could run a 9000-user Exchange server on ATAoE storage. It was time to deep-dive into this “interesting” L2+7 protocol. As expected, there are numerous good reasons you won’t hear about ATAoE in my Data Center 3.0 for Networking Engineers webinar, and I described a few of them in a blog post I wrote for SearchNetworking’s Fast Packet blog.
I’m writing this post while travelling to the Net Field Day 2010, the successor to the awesome Tech Field Day 2010 during which the FCoTR technology was launched. It’s thus only fair to extend that fantastic merger of two technologies we all love, look at the bigger picture and compare storage networking with SNA.
- If you’re too young to understand what I’m talking about, don’t worry. Yes, you’ve missed all the beauties of RSRB/DLSw, CIP, APPN/APPI and the like, but major technology shifts happen every other decade or so, so you’ll be able to use FC/FCoE/iSCSI analogies the next time (and look like a dinosaur to the rookies). Make sure, though, that you read the summary.
- I’ll use present tense throughout the post when comparing both environments although SNA should be mostly history by now.
The FCoE confusion spread by networking vendors has reached new heights with contradictory claims that you need TRILL to run multihop FCoE (or maybe you don’t) and that you don’t need the congestion control specified in the 802.1Qau standard (or maybe you do). Allow me to add to your confusion: they are all correct ... depending on how you implement FCoE.
The storage industry has a very specific view of the networking protocols – they expect the network to be extremely reliable, either by making it lossless or by using a transport protocol (TCP + embedded iSCSI checksums) that was only recently made decently fast.
Some of their behavior can be easily attributed to network-blindness and attempts to support legacy protocols that were designed for a completely different environment 25 years ago, but we also have to admit that the server-to-storage sessions are way more critical than the user-to-server application sessions.
A few days ago I wrote that you should always strive to understand the technologies beyond the reach of your current job. Stephen Foskett is an amazing example to follow: although he’s a storage guru, he knows way more about HTTP than most web developers and details of the web server architecture that most server administrators are not aware of. Read his High-Performance, Low-Memory Apache/PHP Virtual Private Server; you’ll definitely enjoy the details.
And then there’s the ultimate weekend fun: reading Greg’s perspectives on storage and FCoE. It starts with his Magic of FCoTR post (forget the FCoTR joke and focus on the futility of lossless layer-2 networks) and continues with Rivka’s hilarious report on the FCoTR progress. Oh, and just in case you never knew what TR was all about – it was “somewhat” different, but never lossless, so it would be as bad a choice for FC as Ethernet is.
Last but not least, there’s Kevin Bovis, the veritable fountain of common sense, this time delving into the ancient and noble art of troubleshooting. A refreshing must-read.
My next webinar is covering the topics I wanted to address for over half a year, but never found time to do it: the facts and the hype in the Data Center as seen from the perspective of a networking engineer. I was planning to run it sometime in autumn, but the recent TRILL-focused hype has prompted me to schedule it sooner.
In two hours (probably slightly more, my two-hour webinars are usually almost three hours long), we’ll cover:
- Storage technologies and protocols, including iSCSI and FCoE;
- Server virtualization, including virtual machine mobility and high-availability;
- Differences between routing and bridging (including VLANs and PVLANs) and the need for layer-3 routing in the Data Center;
- Emerging technologies, including TRILL, DCB, L2MP and LISP;
- Multi-site considerations and transport options (DWDM, VPLS, MPLS/VPN and OTV).
Follow this link to register for the Data Center webinar ... and let me conclude with excellent news: this is the first time I’ve got multiple registrations before even announcing the webinar.
A month ago Stephen Foskett complained about Microsoft’s lack of support for FCoE. I agree with everything he wrote, but he missed an important point: Microsoft gains nothing by supporting FCoE and potentially a lot if they persuade people to move away from FCoE and deploy iSCSI.
FCoE and iSCSI are the two technologies that can help Fibre Channel gain its proper place somewhere between Tyrannosaurus and SNA. FCoE is a more evolutionary transition (after all, whole FC frames are encapsulated in Ethernet frames) and thus probably preferred by the more conservative engineers who are willing to scrap the Fibre Channel infrastructure, but not the whole protocol stack. Using FCoE gives you the ability to retain most of the existing network management tools, and the transition from FC to FCoE is almost seamless if you have switches that support both protocols (for example, the Nexus 5000).
You’ll find an introduction to SCSI, Fibre Channel, FCoE and iSCSI in the Next Generation IP Services webinar.
A while ago I was criticizing the network-blindness of the storage industry that decided to run a 25-year-old protocol (SCSI) over the most resource-intensive transport protocol (TCP) instead of fixing their stuff or choosing a more lightweight transport mechanism. My argument (although theoretically valid) became moot a few months ago: Intel and Microsoft have demonstrated an iSCSI solution that can saturate a 10GE link and perform more than 1 million I/O operations per second. Another clear victory for Moore’s Law.
You’ll find an introduction to SCSI, Fibre Channel, iSCSI and server virtualization in the Next Generation IP Services webinar.
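To get a feel for how the two numbers relate, here is a back-of-the-envelope sketch. The post does not say what I/O size was used or whether both figures came from the same test run, so the 512-byte I/O size and the function names below are purely my assumptions, and all Ethernet, IP, TCP and iSCSI overhead is ignored.

```python
# Back-of-the-envelope arithmetic; not a model of the actual Intel/Microsoft test.
LINK_BPS = 10 * 10**9      # 10GE line rate in bits per second (overhead ignored)
IOPS = 1_000_000           # one million I/O operations per second

def max_bytes_per_io(link_bps: int, iops: int) -> float:
    """Largest average I/O size (bytes) that fits on the link at a given I/O rate."""
    return link_bps / 8 / iops

def link_utilization(io_size_bytes: int, iops: int, link_bps: int) -> float:
    """Fraction of the link consumed by I/Os of a given size at a given rate."""
    return io_size_bytes * 8 * iops / link_bps

ASSUMED_IO_SIZE = 512      # hypothetical I/O size, not taken from the post

print(max_bytes_per_io(LINK_BPS, IOPS))
print(link_utilization(ASSUMED_IO_SIZE, IOPS, LINK_BPS))
```

Under these assumptions, one million I/Os per second leaves at most 1250 bytes of payload per I/O on a 10GE link, and 512-byte I/Os would use roughly 41% of it. That suggests the saturated link and the million-IOPS figure may well describe two different workloads (large versus small I/Os).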