Blog Posts in February 2013

Evolution of IP Model

I stumbled upon a fantastic RFC - Evolution of the IP Model (RFC 6250) - that should be made mandatory reading for everyone remotely involved with networking. It describes numerous "truths" (politely called misconceptions) that everyone from programmers to network designers still relies upon. Some of my favorites: reachability is symmetric and transitive, loss is rare, addresses are stable, each host has a single interface and a single IP address ... Enjoy!

Example: Multi-Stage Clos Fabrics

Smaller Clos fabrics are built with two layers of switches: leaf and spine switches. The oversubscription ratio you want to achieve dictates the number of uplinks on the leaf switch, which in turn dictates the maximum number of spine switches and thus the fabric size.
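
The chain of dependencies described above (oversubscription ratio, uplinks, spine count, fabric size) can be sketched as a small calculation. This is a rough illustration, not a design tool: it assumes all ports run at the same speed, every leaf has exactly one uplink to every spine, and the function name and the 64-port switch sizes are made up for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class FabricSize:
    uplinks: int        # leaf ports used as uplinks
    downlinks: int      # leaf ports left for servers
    max_spines: int     # one uplink per spine switch
    max_leaves: int     # one spine port per leaf switch
    server_ports: int   # total server-facing ports in the fabric


def size_two_tier_fabric(leaf_ports: int, spine_ports: int,
                         oversubscription: float) -> FabricSize:
    """Size a leaf-and-spine fabric, assuming equal-speed ports everywhere."""
    # Split the leaf ports so that downlinks / uplinks <= oversubscription.
    uplinks = math.ceil(leaf_ports / (oversubscription + 1))
    downlinks = leaf_ports - uplinks
    max_spines = uplinks       # each uplink goes to a different spine
    max_leaves = spine_ports   # each spine port connects one leaf
    return FabricSize(uplinks, downlinks, max_spines, max_leaves,
                      max_leaves * downlinks)


# Hypothetical 64-port leaf and spine switches with 3:1 oversubscription
size = size_two_tier_fabric(leaf_ports=64, spine_ports=64, oversubscription=3)
print(size.uplinks, size.downlinks, size.server_ports)  # 16 48 3072
```

Lowering the oversubscription ratio increases the number of uplinks (and spine switches) but leaves fewer server-facing ports per leaf, so the total fabric size shrinks.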

You have to use a multi-stage Clos architecture if you want to build bigger fabrics; Brad Hedlund described a sample fabric with over 24,000 server-facing ports in the Clos Fabrics Explained webinar.

Virtual Tenant Networks with NEC ProgrammableFlow

Virtual tenant networks are one of the best features of the NEC ProgrammableFlow solution – you can build virtual layer-2 subnets (based on VLANs, edge ports or port/VLAN combos), connect them with a virtual router, and implement packet filters and traffic steering ... while treating the whole data center fabric as a single device.

Even better, the ingress edge switch performs all the operations you configure (ACLs, L2 lookup, L3 lookup, source/destination MAC rewrite), resulting in optimal end-to-end forwarding.

WAN Routing in Data Centers with Layer-2 DCI

A while ago I got an interesting question:

Let's say that due to circumstances outside of your control, you must have stretched data center subnets... What is the best method to get these subnets into OSPF? Should they share a common area at each data center or should each data center utilize a separate area for the same subnet?

Assuming someone hasn’t sprinkled the application willy-nilly across the two data centers, it’s best if the data center edge routers advertise subnets used by the applications as type-2 external routes, ensuring one data center is always the primary entry point for a specific subnet. Getting the same results with BGP routing in the Internet is a much tougher challenge.
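
To see why type-2 external routes give you a single primary entry point, here is a simplified sketch of how a router compares them: the external metric set at the advertising router is compared first, and the internal cost toward that router only breaks ties. The router names and metric values are hypothetical, and the sketch ignores the other steps of the full OSPF route selection.

```python
from typing import List, NamedTuple


class E2Route(NamedTuple):
    asbr: str             # advertising data center edge router (hypothetical name)
    external_metric: int  # metric set when the subnet is advertised
    cost_to_asbr: int     # internal OSPF cost from this router to the edge router


def best_e2_route(routes: List[E2Route]) -> E2Route:
    """Pick the preferred type-2 external route (simplified comparison)."""
    # The external metric wins; the internal cost is only a tie-breaker.
    return min(routes, key=lambda r: (r.external_metric, r.cost_to_asbr))


# A WAN router that sits much closer to data center B than to data center A
routes = [
    E2Route("dc-a-edge", external_metric=100, cost_to_asbr=50),
    E2Route("dc-b-edge", external_metric=200, cost_to_asbr=5),
]
print(best_e2_route(routes).asbr)      # dc-a-edge

# If data center A stops advertising the subnet, traffic shifts to B
print(best_e2_route(routes[1:]).asbr)  # dc-b-edge
```

Because the internal cost does not influence the choice unless the external metrics are equal, every router in the network picks the same data center regardless of its own location.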

Process, Fast and CEF Switching and Packet Punting

I’m probably flogging a fossilized skeleton of a long-dead horse, but it seems I never wrote about this topic before, so here it is (and you might want to read this book for more details).

Process switching is the oldest, simplest and slowest packet forwarding mechanism. Packets received on an interface trigger an interrupt, the interrupt handler identifies the layer-3 protocol based on layer-2 packet headers (example: Ethertype in Ethernet packets) and queues the packets to (user mode) packet forwarding processes (IP Input and IPv6 Input processes in Cisco IOS).
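
The classification step can be illustrated with a short sketch. It is a toy model, not how Cisco IOS is implemented: it assumes untagged Ethernet II frames, and the function and queue names are made up (only the "IP Input" and "IPv6 Input" process names come from the text above).

```python
import struct
from collections import deque
from typing import Deque, Dict, Optional

ETHERNET_HEADER_LEN = 14  # destination MAC (6) + source MAC (6) + Ethertype (2)

# Ethertype values mapped to the forwarding process that handles them
PROCESS_BY_ETHERTYPE = {0x0800: "IP Input", 0x86DD: "IPv6 Input"}

# One input queue per forwarding process
input_queues: Dict[str, Deque[bytes]] = {
    name: deque() for name in PROCESS_BY_ETHERTYPE.values()
}


def on_receive_interrupt(frame: bytes) -> Optional[str]:
    """Identify the layer-3 protocol and queue the packet for its process."""
    if len(frame) < ETHERNET_HEADER_LEN:
        return None  # runt frame, nothing to classify
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    process = PROCESS_BY_ETHERTYPE.get(ethertype)
    if process is None:
        return None  # protocol not handled in this sketch
    input_queues[process].append(frame[ETHERNET_HEADER_LEN:])
    return process


# Dummy frame: zeroed MAC addresses, IPv4 Ethertype, placeholder payload
frame = bytes(12) + b"\x08\x00" + b"payload"
print(on_receive_interrupt(frame))    # IP Input
print(len(input_queues["IP Input"]))  # 1
```

The interrupt handler does as little as possible; the actual forwarding decision is made later, when the queued packet is picked up by the forwarding process.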

The Saga of Oversubscriptions

Matt Thompson provided a really good answer to the “what’s acceptable oversubscription ratio in a ToR switch” question when he wrote “I’m expecting a ‘how long is a piece of string’ answer” (note: do watch the BBC video answering that one).

There’s the 3:1 rule-of-thumb recipe, with a more realistic answer being “it depends”. Now let’s see if we can go beyond that without a deep dive into scholastic waters.
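
For reference, the 3:1 figure is simply the ratio of server-facing to uplink bandwidth. A minimal sketch, using the common ToR port configuration of 48 x 10GE downlinks and 4 x 40GE uplinks (the function name is made up for the example):

```python
from typing import List, Tuple

PortGroup = Tuple[int, int]  # (port count, speed in Gbps)


def oversubscription_ratio(downlinks: List[PortGroup],
                           uplinks: List[PortGroup]) -> float:
    """Ratio of total server-facing bandwidth to total uplink bandwidth."""
    down = sum(count * speed for count, speed in downlinks)
    up = sum(count * speed for count, speed in uplinks)
    return down / up


ratio = oversubscription_ratio(downlinks=[(48, 10)], uplinks=[(4, 40)])
print(f"{ratio:g}:1")  # 3:1
```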

Internet-in-a-VRF and LFIB Explosion

Matthew Stone encountered another unintended consequence of the full-Internet-routing-in-a-VRF design: the TCAM on his 6500 was 80% utilized even though he has the new Sup modules that support one million IPv4 routes.

A closer look revealed the first clue: L3 forwarding resources on a Cat6500 are shared between IPv4 routes and MPLS labels (I don’t know about you, but I was not aware of that), and half the entries were consumed by MPLS labels.

Intra-Spine Links in Leaf-and-Spine Fabrics

I had an interesting conversation with Doug Hanks (@douglashanksjr) about the need for intra-spine links in leaf-and-spine fabric designs. You clearly don’t need links between spine switches when every leaf node (switch or router/firewall/load balancer) is connected to all spine switches ... but what happens when one of the leaf-to-spine links fails? Will other leaf switches know that they have to avoid the spine switch with the failed link?
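
The failure scenario is easy to model. The sketch below only computes which spine switches remain valid transit points between two leaf nodes after a link failure; whether a leaf actually learns this depends on the control plane in use, which the sketch does not model. The topology and all names are hypothetical.

```python
from typing import Set, Tuple

Link = Tuple[str, str]  # (leaf, spine)


def spines_of(links: Set[Link], leaf: str) -> Set[str]:
    """Spine switches the given leaf still has a working link to."""
    return {spine for (node, spine) in links if node == leaf}


def usable_spines(links: Set[Link], src_leaf: str, dst_leaf: str) -> Set[str]:
    """Spines that can carry traffic from src_leaf to dst_leaf in one hop."""
    return spines_of(links, src_leaf) & spines_of(links, dst_leaf)


leaves = ["leaf1", "leaf2", "leaf3"]
spines = ["spine1", "spine2"]

# Every leaf is connected to all spines, no links between the spines
links = {(leaf, spine) for leaf in leaves for spine in spines}

# Fail a single leaf-to-spine link
links.discard(("leaf2", "spine1"))

print(sorted(usable_spines(links, "leaf1", "leaf2")))  # ['spine2']
print(sorted(usable_spines(links, "leaf1", "leaf3")))  # ['spine1', 'spine2']
```

In this model, leaf1 must stop using spine1 for traffic toward leaf2 while still using both spines toward leaf3. Without links between the spine switches, spine1 has no alternate path to leaf2.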

Nexus 6000 and 40GE – why do I care?

Cisco launched two new data center switches on Monday: Nexus 6001, a 1RU ToR switch with the exact same port configuration as any other ToR switch on the market (48 x 10GE, 4 x 40GE usable as 16 x 10GE) and Nexus 6004, a monster spine switch with 96 40GE ports (it has the same bandwidth as Arista’s 7508 in a 4RU form factor and three times as many 40GE ports as Dell Force10 Z9000).

Apart from slightly higher port density, Nexus 6001 looks almost like Nexus 5548 (which has 48 10GE ports) or Nexus 3064X. So where’s the beef?

SDN, Windows and Fruity Alternatives

Brad Hedlund made a pretty valid comment to my “NEC Launched a Virtual OpenFlow Switch” blog post: “On the other hand, it's NEC end-to-end or no dice”, implying the ultimate vendor lock-in.

Of course he’s right, and while, as Bob Plankers explains, you can never escape some lock-in (part 1, response from Greg Ferro, part 2 – all definitely worth reading), you do have to ask yourself “am I looking for Windows or Mac?”
