Blog Posts in May 2011
The Edge Virtual Bridging (EVB; 802.1Qbg) standard solves two important layer-2-based virtualization issues:
- Automatic provisioning of access switches based on hypervisor-signaled information (discussed in the EVB eases VLAN configuration pains article)
- Multiplexing of multiple logical 802.1Q links over a single physical link.
Logical link multiplexing might seem a solution in search of a problem until you discover that VMware-related design documents usually recommend using 6 to 10 NICs per server – an approach that either wastes switch ports or is hard to implement with blade servers’ mezzanine cards (due to limited number of backplane connections).
Just got this question from one of my Service Provider friends: “If I am building a new MPLS backbone from scratch, should I design it with Carrier’s Carrier in mind?” Of course you should ... after all, the CsC functionality has almost no impact on the MPLS backbone (apart from introducing an extra label in the label stack).
The article of the week is indubitably (how’s that for a Scrabble word?) RFC 6250, describing some of the stupidities (politely called misconceptions) TCP/IP stack had to deal with.
Other great content I stumbled upon during this week (in random order):
Christoph Jaggi published a new revision of his excellent Ethernet Encryptor Market Overview documents: introduction, point-to-point and multipoint products. All you ever wanted to know about encryption (and much more).
... as some of its supporters seem to believe every now and then (I do get severe allergic reaction when someone claims it will change the laws of physics or when I’m faced with technical inaccuracies not to mention knee-jerking financial experts). Even more, assuming it can cross the adoption gap, it could fundamentally change the business models of networking vendors (maybe not in the way you’d like them to be changed). You can read more about my OpenFlow views in the article I wrote for SearchNetworking.
On the more technological front, I still don’t expect to see miracles. Most OpenFlow-related ideas I’ve heard about have been tried (and failed) before. I fail to see why things would be different just because we use a different protocol to program the forwarding tables.
Jason sent me an interesting question a few days ago: “assuming a vSwitch *did* support MPLS/VPN PE router functionality, what type of protocol support would be needed on the access layer switches?”
While the MPLS/VPN support in hypervisor switches remains in the realm of science fiction, it’s worth knowing that there are at least five different transport options you can use between PE-routers. Here they are, from the most decoupled to the most tightly coupled ones:
In an amazing coincidence, Amazon launched IPv6-enabled Elastic Load Balancing just hours before my Enterprise IPv6 – the first steps webinar (you can still register for an online session) in which I describe (among other things) how you can make your IPv4-only content reachable over IPv6 with NAT64 or with 6-to-4 load balancing.
Two months ago I wrote the Data Center Fabric Architectures post jokingly defining Borg and Big Brother architectures. In the meantime, a number of vendors have launched (or announced) their fabric products and the post badly needed an update.
I decided to move the updated text to my main web site (where it will be easier to edit), wrote an introductory section, removed a few tongue-in-cheek comments (after all, it’s time to get serious if Cisco’s Data Center blog links to your article) and added numerous links to in-depth articles and examples of individual architectures.
Nosx added a very valid point-of-view to my MPLS/VPN Common Services Design post:
This is an overly complex and unsupportable approach to shared services. Having to touch thousands of VRFs to create a shared services VPN is unacceptable. The correct approach is to touch only the "services" vrf, and import/export to each RT that you wish to insert the services into.
As always, the right answer is “it depends.” If you have few large customers, it makes way more sense to add their RTs to the common services VRF. If you have many small customers, adding RTs to the common services VRF does not scale.
When I was explaining stateless and stateful NAT64 a while ago, I compared them to what we used to know as NAT (L3-only translation) and PAT (per-L4-session translation). Wrong. Even L3-only NAT44 needs some state (inside-to-outside IPv4 address mapping).
Stateless NAT64 is truly stateless: it uses a deterministic algorithm to convert IPv4 addresses into specially crafted IPv6 addresses (a good diagram is included in IOS XE documentation). It’s also mostly useless.
My Inbox is (yet again) overflowing with great links.
Tassos is describing the DHCPv6 prefix delegation nightmare in great details.
Cut Me Some SLAAC, Or Why You Need RA Guard by Tom the Networking Nerd describes the details of the recently-hyped SLAAC vulnerability. Conclusion: work with a vendor that knows a bit about fixing security problems.
I guess we all heard the fundamentalist IPv6 mantra by now: “Every subnet gets a /64.” Being a good foot soldier, I included it in my Enterprise IPv6 webinar (last live session in 1H2011 in a few days – register here). Time to fix that slide and admit what we also knew for a long time: IPv6 is classless and we have yet to see the mysterious device that dies in flames when sniffing a prefix longer than a /64.
Jeroen sent me an interesting challenge: he would like to reload the router when the 3G WAN interface gets stuck (I thought my Nokia phone is the only one exhibiting this problem, but obviously I was wrong). The reload-on-failed-ping EEM applet I’ve published would be a perfect solution, but it uses track delay and the maximum delay timeout is three minutes, while Jeroen would like to wait 15 minutes before reloading the router.
The Common Services MPLS/VPN topology is the topology in which multiple customers access the same common servers without being able to access each other’s networks. You can implement this requirement with judicious use of inter-VRF NAT or with controlled route leaking between customers’ and common services VRFs (assuming the customers don’t use overlapping address space).
I got totally fed up with the currently popular “flat-earth with long-distance bridging” architecture paradigm while developing the Data Center Interconnects webinar. It all started with the layer-2 hypervisor switches and lack of decent L3 network-side solutions; promoting non-scalable cloudy solutions doesn’t help either.
The network infrastructure would scale better if the hypervisors would work as MPLS/VPN PE-routers, but even MPLS would hit scalability limits when the number of servers grows into tens of thousands. The only truly scalable solution is IP-over-IP or MAC-over-IP implemented in the hypervisor switches.
I tried to organize all these thoughts in the “How to build a scalable IaaS cloud network infrastructure” article that was recently published by SearchTelecom ... and just a few days after the article was published, Brad Hedlund pointed me to Infrastructure as a Service Builder’s Guide document, which is saying almost the same thing (and coming to flawed conclusions because they had to promote OpenFlow and NEC).
Brandon Carroll has recently launched MyIPv6Tutor.com, an e-learning program targeted at engineers that want to learn IPv6 basics at their own pace. I know Brandon is an excellent instructor, so there are at least four IPv6 training options I can wholeheartedly recommend.
A while ago I described what it takes to integrate TRILL backbone with the legacy equipment running Spanning Tree Protocol (STP). Unfortunately, Brocade decided to use a non-standard approach to BPDU handling when implementing their TRILL-like VCS fabric. VDX switches running in fabric mode can either drop incoming BPDU frames or transport them transparently across the fabric to other edge ports. Although VDX switches support STP, RSTP and MSTP (as well as RootGuard and BPDUGuard) when configured as standalone switches, the STP processing is disabled when you configure fabric mode; VCS fabric looks like a huge shared LAN segment to the end hosts and core switches.
2013-03-31: Network OS 4.0 and above supports Distributed Spanning Tree (DiST), for more details read this blog post.
HP’s FlexNetwork architecture launch at Interop has received mixed responses: from pretty positive from Tom the Networking Nerd to cautiously optimistic from Greg (Etherealmind) Ferro and more cautious analysis by Shamus McGillicuddy. For a grumpy skeptic’s take, read my FastPacket blog post.
Few days ago I enjoyed listening to the Teredo-bashing Packet Pushers Podcast during which Greg & the crew simply couldn’t avoid NAT64. Tom even wrote a follow-up post explaining why NAT is bad (we all agree with that) and why we shouldn’t use it in IPv6. Unfortunately he missed the elephant in the room: it’s all about the legacy content. IPv6-only residential users have to access IPv4-only content.
Frequent eruptions of OpenFlow-related hype (a recent one caused by Brocade Technology Day Summit; I’m positive Interop will not lag behind) call for a continuous myth-busting efforts. Let’s start with a widely-quoted (and immediately glossed-over) fact from Professor Scott Shenker, a founding board member of the ONF: “[OpenFlow] doesn't let you do anything you couldn't do on a network before.”
Whenever I write about vCloud Director Networking Infrastructure (vCDNI), be it a rant or a more technical post, I get comments along the lines of “What are the network guys going to do once the infrastructure has been provisioned? With vCDNI there is no need to keep network admins full time.”
Once we have a scalable solution that will be able to stand on its own in a large data center, most smart network admins will be more than happy to get away from provisioning VLANs and focus on other problems. After all, most companies have other networking problems beyond data center switching. As for disappearing work, we've seen the demise of DECnet, IPX, SNA, DLSw and multi-protocol networks (which are coming back with IPv6) without our jobs getting any simpler, so I'm not worried about the jobless network admin. I am worried, however, about the stability of the networks we are building, and that’s the only reason I’m ranting about the emerging flat-earth architectures.
Occasionally I get e-mails from readers that can’t believe my description of yearly webinar subscription is correct. A few days ago I got this set of questions:
If I pay the $199.00 does that mean I have access to ALL of your webinars?
Absolutely, all sixteen of them (with new ones being added every two or three months). And don’t forget you also get unlimited access to all live webinars.
Update 2011-05-05 16:50UTC: Added VN-Link/802.1Qbh
Challenge: If you want to deploy virtual machines belonging to different security zones within the same physical host, you have to isolate them. VLANs are the most common approach. If you want to migrate a running VM from one host to another while preserving its user sessions, you usually have to rely on bridging. The set of VLANs needed on a trunk link between the hypervisor host and access switch is thus unpredictable (more information in my VMware Networking Deep Dive webinar)
Solution#1 (painful): Configure all possible VLANs on the trunk link. Stretched VLANs spanning the whole data center are an ideal ingredient of a major meltdown.
On June 8th (the World IPv6 Day) you’ll see Facebook, Google and a number of other web sites reachable over IPv4 and IPv6 (more accurately: the DNS records for their web sites will have both A and AAAA records). No problem ... unless your users have misconfigured workstations and you haven’t deployed IPv6 throughout your network yet (not many have).
Users with broken IPv6 connectivity will experience long delays connecting to major public web sites. Their workstations will try to reach the content over IPv6 first and will have to experience a TCP-level timeout before retrying to get the same content over IPv4. Guess whose phone will ring ... and what the problem description will be ;)
A few days ago I was discussing a data center design with a seasoned network architect and during the MPLS discussions he made an offhand remark “there are still some switches running OSPF and using network 0.0.0.0 and redistribute connected.” My first thought was “this can’t be good” but I had no idea how bad it is until I ran a lab test.
The generic dilemma along the lines of “should I make connected interfaces part of my OSPF process (and make them passive) or should I redistribute them into OSPF” has no clear-cut answer (apart from the obvious “it depends”) ... and Google will quickly find you tons of lengthy discussions.
Summary for differently attentive: A hub router failure in multi-hub DMVPN networks can cause spoke-to-spoke traffic disruptions that last up to three minutes.
Almost every DMVPN design I’ve seen has multiple hubs for redundancy purposes. I’ve always preached the “one hub per DMVPN tunnel” mantra (see the diagram below) to those who were willing to listen citing “NHRP issues after hub failure” as one of the main reasons you should not have two or more hubs per DMVPN tunnel.
Working on the May Day feels like an oxymoron, but Sundays are about the only time I can clean up my overflowing Inbox.
The best post I’ve stumbled across recently is undoubtedly 38 life lessons I’ve learned in 38 years (thank you, @greg_meehan). I will try to remember the slow down one. Another great one: Managing IT people from Storagebod. Been there, seen that (and failed a few times).
And here’s the usual long list of links: