Blog Posts in October 2011
Michael sent me an interesting question:
I work in a rather large enterprise facing a campus network redesign. I am in favor of using a routed access for floor LANs, and make Ethernet segments rather small (L3 switching on access devices). My colleagues seem to like L2 switching to VSS (distribution layer for the floor LANs). OSPF is in use currently in the backbone as the sole routing protocol. So basically I need some additional pros and cons for VSS vs Routed Access. :-)
The follow-up questions confirmed he has L3-capable switches in the access layer connected with redundant links to a pair of Cat6500s:
The last few days were exquisite fun: it was great meeting so many people focusing on a single technology (OpenFlow) and concept (Software-Defined Networking, whatever that means) that just might overcome some of the old obstacles (and introduce new ones). You should be at least a bit curious what this is all about, and even if you don’t see yourself ever using OpenFlow or any other incarnation of SDN in your network, it never hurts to enhance your resume with another technology (as long as it’s relevant; don’t put CICS programmer at the top of it).
We finished a fantastic Network Field Day (second edition) yesterday. While it will take me a while (and 20+ blog posts) to recover from the information blast I received during the last two days, here are the first impressions:
Explosion of innovation – and it’s not just OpenFlow and/or SDN. Last year we’ve seen some great products and a few good ideas (earning me the “grumpy old man that’s hard to make smile” fame), this year almost every vendor had something that excited me.
A while ago I got a set of MPLS/VPN-related questions from one of my long-time readers furiously working on a response to a large RFP. I answered the questions and (more as an afterthought) mentioned the ExpertExpress service I had been starting to consider. His response amazed me:
ExpertExpress is definitely a very very good idea!!! You know what? I think I will push the company to try to use it to get your advice on the current engagement. The company needs this "yesterday" so I would be able to verify my design and will feel safer with it and will deliver it on time and of course you will receive a fair payment for this.
Next question – when could we do it? Response: how about tomorrow? Sure, no problem (note: it doesn’t always work out that way).
Like every other blogger, I get occasional e-mails from people fishing for free consulting or second opinion (note: asking a serious technical question is a totally different story; as many people know, I always try to reply and help) and as I’m totally overloaded with OpenFlow symposium and Net Field Day these days, I decided to share one of the better ones.
The second presentation I had @ EuroNOG 2011 described networking requirements of various cloud services – a perfect combination of virtual networking and scalability, two of the topics I’m currently obsessed with. If you’re looking for more details, register for the Cloud Computing Networking: Under the Hood webinar on December 14th.
Initial release of QFabric Junos can run STP only within the network node (see QFabric Control Plane post for more details), triggering an obvious question: “what happens if a server multihomed to a server node starts bridging between its ports and starts sending BPDUs?”. Some fabric solutions try to ignore STP (the diplomats would say “they are transparent to STP”) but fortunately Juniper decided to do the right thing.
While everyone deeply involved with OpenFlow agrees it’s just a low-level tool that can’t solve problems we couldn’t solve in the past (just like replacing Tcl with C++ won’t help you prove P = NP), occasionally you stumble across mindboggling ideas that are so simple you have to ask yourself: “were we really that stupid?” One of them that obviously impressed James Hamilton is the solution to load balancing that requires no load balancers.
Before clicking Read more, watch this video and try to figure out what the solution is and why we’re not using it in large-scale networks.
After more than a year, I’m back in California, anxiously waiting to meet my fellow bloggers and ask some tough questions to a fantastic lineup of vendors presenting at Net Field Day 2011. Stephen Foskett’s well-oiled organizing machinery is already in full gear; I’m typing this post from a WiFi-equiped car that picked me up @ SFO airport (you see, dear vendors, it’s so easy to make my inner geek happy ... all I need are some fantastic features that are actually usable and work as well as this WiFi connection).
I spent a lot of time during the last year analyzing various data center architectures proposed by networking vendors – from simple solutions like MLAG to complex architectures like QFabric – and presented the summary of my findings in a short presentation @ EuroNOG 2011 that received positive feedback from Cisco, Juniper and Brocade (and a post from Brook Reams @ EthernetFabric). That presentation (you can view it online) was pretty short due to the 45-minute slot I got and I decided to expand it into a proper 2-to-3 hour webinar.
InformationWeek has recently published an OpenFlow article by Jeff Doyle in which they graced me with a single grumpy quote taken out of three pages of excellent questions that Jeff asked me when preparing for the article. Jeff has agreed that I publish the original questions and my answers to them. Here they are (totally unedited):
This is a nice email I got from an engineer struggling with multi-homing BGP setup:
We faced a problem with our internet routers a few days back. The engineer who configured them earlier used the syntax: network x.x.x.x mask y.y.y.y route-map PREPEND to influence the incoming traffic over two service-providers.
... and of course it didn’t work.
Yesterday New York Times published an article covering Nicira, a semi-stealthy startup working on an open source soft switch (Open vSwitch) and associated OpenFlow-based controller, triggering immediate responses from GigaOm and Twilight in the Valley of the Nerds. While everyone got entangled in the buzzwords (or lack of them), not a single article answered the question “what is Nicira really doing?” Let’s fix that.
One of the areas where IPv6 sorely lacks feature parity with IPv4 is user authentication and source IP spoofing prevention in large-scale Carrier Ethernet networks. Metro Ethernet switches from numerous vendors offer all the IPv4 features a service provider needs to build a secure and reliable access network where the users can’t intercept other users’ traffic or spoof source IP addresses, and where it’s always possible to identify the end customer from an IPv4 address – a mandatory requirement in many countries. Unfortunately, you won’t find most of these features in those few Metro Ethernet switches that support IPv6.
A week ago I was writing about the latency and bandwidth challenges of long-distance vMotion and why it rarely makes sense to use it in disaster avoidance scenarios (for a real disaster avoidance story, read this post ... and note no vMotion was used to get out of harm’s way). The article I wrote for SearchNetworking tackles an idea that is an order of magnitude more ridiculous: using vMotion to migrate virtual machines around the world to bring them close to the users (follow-the-sun workload mobility). I wonder which islands you’d have to use to cross the Pacific in 10ms RTT hops supported by vMotion?
While preparing for the IPv6 seminar I’m delivering in Rome, I had to reinvent a few wheels, including slides explaining IPv6 addressing and host behavior ... giving me a perfect reason to study the RFCs and figure out how exactly IPv6 stateless autoconfiguration (RFC 4862) works.
Greg (@etherealmind) Ferro started an interesting discussion on Google+, claiming MPLS is just tunneling and a duct tape like NAT. I would be the first one to admit MPLS has its complexities (not many ;) and shortcomings (a few ;), but calling it a tunnel just confuses the innocents. MPLS is not tunneling, it’s a virtual-circuits-based technology, and the difference between the two is a major one.
Got this set of questions from a CCIE pondering emerging technologies that could be of potential use in his data center:
I don’t think OpenFlow is clearly defined yet. Is it a protocol? A model for Control plane – Forwarding plane FP interaction? An abstraction of the forwarding-plane? An automation technology? Is it a virtualization technology? I don’t think there is consensus on these things yet.
Every time I’m discussing the VXLAN technology with a fellow networking engineer, I inevitably get the question “how will I connect this to the outside world?” Let’s assume you want to build pretty typical 3-tier application architecture (next diagram) using VXLAN-based virtual subnets and you already have firewalls and load balancers – can you use them?
The product information in this blog post is outdated - Arista, Brocade, Cisco, Dell, F5, HP and Juniper are all shipping hardware VXLAN gateways (this post has more up-to-date information). The concepts explained in the following text are still valid; however, I would encourage you to read other VXLAN-related posts on this web site or watch the VXLAN webinar to get a more recent picture.
I was overloaded during the last few weekends and my Inbox is yet again overflowing with links to excellent content. For a warm-up, look at the eight levels of vendor acceptance (a side effect of a really tough lab test during the EuroNOG 2011 conference).
On a more serious note, the most useful article of this week is probably the BGPmon Web Services API that describes how you can query the global BGP table through whois or SOAP.
Ethan Banks, one of the masterminds behind the Packet Pushers podcast, wrote a spot-on blog describing why enterprises don’t deploy IPv6. Unfortunately, most of the enterprise networking engineers follow the same line of reasoning, and a few of them might feel like the proverbial deer caught in the headlights once something totally unexpected happen ... like their CEO vacationing in China, getting only IPv6 address on the iPhone, and thus not being able to access a mission-critical craplication. For a longer-term perspective, read an excellent reply written by Tom Hollingsworth.
I’ve first heard about CloudSwitch when writing about vCider. It seemed like an interesting idea and I wanted to explore the networking aspects of cloud VLAN extension for my EuroNOG presentation. My usual approach (read the documentation) failed – the documentation is not available on their web site – but I got something better: a briefing from Damon Miller, their Director of Technical Field Operations. So, this is how I understood CloudSwitch works (did I get it wrong? Write a comment!):
As you might know, I did a Data Center Fabric Architectures presentation at the EuroNOG 2011 conference last week. The slides are available online; for a more in-depth version, please register for my Data Center Fabrics webinar.
As always, if I got anything wrong, please write a comment.
There’s so much great material being written on VXLAN and NVGRE that I decided to write a separate post listing the best of it (if you still can’t decide whether you should care about VXLAN, register for my Introduction to Virtual Networking webinar).
The question of high-availability cloud services (let’s agree reliable in this context really means highly available) pops up every time I discuss cloud networking requirements with enterprise-focused experts. While it’s obvious the software- and platform services must be highly available (as their users have few mechanisms to increase their availability), Infrastructure-as-a-Service (IaaS) remains a grey area.
However, once you look at the question from the business perspective, it seems Amazon probably made a pretty good choice: offer reasonably-available service at a low price. You’ll find more in-depth arguments in the Cloud infrastructure services: Balancing high availability and cost article I wrote for SearchCloudProvider.com.
I just came back from a fantastic conference – EuroNOG 2011 in beautiful Krakow. I’ve been to too many conferences in my life, but this one really stood out for two reasons: the nerdiness factor (where it got to the level of advanced presentations @ Cisco Live) and the fantastic crew organizing it.