Nexus 6000 and 40GE – why do I care?
Cisco launched two new data center switches on Monday: Nexus 6001, a 1RU ToR switch with the exact same port configuration as any other ToR switch on the market (48 x 10GE, 4 x 40GE usable as 16 x 10GE), and Nexus 6004, a monster spine switch with 96 x 40GE ports (it has the same bandwidth as Arista’s 7508 in a 4RU form factor and three times as many 40GE ports as the Dell Force10 Z9000).
Apart from slightly higher port density, Nexus 6001 looks almost like Nexus 5548 (which has 48 10GE ports) or Nexus 3064X. So where’s the beef?
For starters, Nexus 6001 supports FCoE and numerous other goodies that Nexus 3064X doesn’t (FabricPath, Adapter- and VM-FEX). If you plan to build an end-to-end FCoE network, you just might want to have 40GE links in the core, but even if you’re past the Fibre Channel stage of data center network evolution and use iSCSI or NFS, it won’t hurt to have full DCB support throughout your network.
Both switches in the Nexus 6000 family have an interesting forwarding table concept: 256K entries are shared between MAC, ARP, ND and (*,G) entries. By default, the MAC address table has 128K entries, with the other 128K being available for L3 entries, but you can move the boundary if you have to. Oh, BTW, IPv6 entries use twice as much host table space as IPv4 entries (not surprisingly), for a maximum of ~50K dual-stack entries.
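If you like playing with these numbers, here’s a quick back-of-the-envelope sketch. The 256K total and the 2:1 IPv6-to-IPv4 cost come straight from the paragraph above; the alternate MAC/L3 split and the assumption that a dual-stack host burns one IPv4 plus one IPv6 entry are my guesses, so treat the output as illustrative arithmetic, not verified scalability figures.

```python
# Back-of-the-envelope model of the shared forwarding table budget.
# Known from the text: 256K shared entries, IPv6 host entries cost twice
# as much as IPv4 ones. Assumed: the sample MAC/L3 splits below and the
# "dual-stack host = one IPv4 + one IPv6 entry" accounting.
TOTAL_ENTRIES = 256 * 1024
IPV4_COST = 1          # table slots per IPv4 host entry
IPV6_COST = 2          # IPv6 host entries consume twice the space

def max_dual_stack_hosts(l3_entries):
    """Each dual-stack host needs one IPv4 and one IPv6 entry."""
    return l3_entries // (IPV4_COST + IPV6_COST)

for mac_share_k in (128, 96):          # MAC table size in K entries (examples)
    l3_entries = TOTAL_ENTRIES - mac_share_k * 1024
    print(f"MAC: {mac_share_k}K, L3: {l3_entries // 1024}K "
          f"-> ~{max_dual_stack_hosts(l3_entries) // 1000}K dual-stack hosts")
```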
However, 40GE support is the most important part of the new launch for me. While 40GE uses the same four multimode fiber pairs as four 10GE links, and has slightly higher bandwidth density (because a single 40GE connector uses less space than four 10GE connectors), the real difference lies in the way you can use the four pairs. You have to combine four 10GE links into a port channel (aka Link Aggregation Group) or use ECMP routing to get the same nominal bandwidth as a single 40GE link ... and even then the real-life throughput falls short due to port channel/ECMP load sharing limitations (exception: Brocade VCS Fabric). Having four 40GE uplinks is definitely much better than having 16 10GE uplinks.
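To make the load-sharing argument a bit more tangible, here’s a toy Python model of per-flow hashing across a 4 x 10GE port channel. The flow rates and the hash function are made up for illustration; the only point is that each flow sticks to a single member link, so no flow can ever get more than 10 Gbps and big flows can pile up on the same member.

```python
# Toy model of per-flow load sharing on a 4 x 10GE port channel.
import zlib

MEMBERS = 4            # four 10GE links in the bundle
LINK_GBPS = 10.0

# A handful of example flows (rates in Gbps); total offered load is 34.5 Gbps,
# comfortably below the nominal 40 Gbps of the bundle.
flows = [8.0, 8.0, 6.0, 4.0, 3.0, 2.0, 1.5, 1.0, 0.5, 0.5]

member_load = [0.0] * MEMBERS
for i, rate in enumerate(flows):
    flow_key = f"10.0.0.{i}:49152->10.0.1.1:80:tcp"     # stand-in 5-tuple
    member = zlib.crc32(flow_key.encode()) % MEMBERS    # per-flow hash
    member_load[member] += rate

print(f"offered load: {sum(flows):.1f} Gbps on a nominal 40 Gbps bundle")
for i, load in enumerate(member_load):
    note = "congested" if load > LINK_GBPS else "ok"
    print(f"  member {i}: {load:4.1f} Gbps ({note})")
# On a true 40GE link the same flows would share one 40 Gbps pipe and a
# single flow could (in principle) burst well beyond 10 Gbps.
```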
Next, if you plan to build large data center fabrics, you must consider the Nexus 6004. You can use it together with almost any 10/40GE ToR switch to build leaf-and-spine Clos fabrics with over 4500 ports without resorting to 10GE breakout cables or multi-stage Clos fabrics ... and since Nexus 6004 supports FabricPath, you could use this fabric in L2 or L3 mode if you use Nexus 6001 as the ToR switch (not that I would ever recommend having large L2 fabrics).
Here’s the math: 1:3 end-to-end oversubscription ratio, four spine switches with 96 40GE ports each, 96 leaf switches with a 40GE uplink to each spine switch. Each leaf switch has 48 10GE ports; 48 x 96 = 4608.
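If you want to double-check the numbers, here’s the same arithmetic as a few lines of Python (it uses nothing beyond the port counts quoted above):

```python
# Leaf-and-spine port math for the Nexus 6004 + 48-port ToR design.
SPINES = 4                   # Nexus 6004 spine switches
SPINE_PORTS_40GE = 96        # 40GE ports per spine
LEAF_EDGE_PORTS_10GE = 48    # 10GE server-facing ports per leaf
LEAF_UPLINKS_40GE = 4        # one 40GE uplink to each spine

leaves = SPINE_PORTS_40GE                      # one leaf per spine port -> 96
edge_ports = leaves * LEAF_EDGE_PORTS_10GE     # 96 * 48 = 4608
downlink_gbps = LEAF_EDGE_PORTS_10GE * 10      # 480 Gbps toward the servers
uplink_gbps = LEAF_UPLINKS_40GE * 40           # 160 Gbps toward the spines

print(f"leaf switches: {leaves}, total 10GE edge ports: {edge_ports}")
print(f"per-leaf oversubscription: {downlink_gbps // uplink_gbps}:1")   # 3:1
```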
Finally, Cisco announced a new FEX: 2248PQ-10G with 48 10GE ports and four 40GE uplinks. You can use two Nexus 6004 switches and the new fabric extenders to build a ~2300-port fabric with just two managed nodes (as long as you don’t care that the total switching capacity of the fabric is “only” around 15 Tbps because all switching is done in the spine switches).
Here’s the math: with a 1:3 oversubscription ratio, each FEX has two 40GE uplinks to a pair of 6004 switches, resulting in a maximum of 48 2248PQ-10G fabric extenders, each with 48 10GE ports. 48 x 48 = 2304. The actual number might be lower due to FEX scalability limitations – it will be a while before the NX-OS Verified Scalability Guide is published for Nexus 6000.
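And here’s the same sanity check for the FEX-based design. The capacity figure simply adds up both directions of every 40GE port on the two spine switches (one reasonable way to read the ~15 Tbps claim), and, as noted above, the supported FEX count per Nexus 6004 may well end up lower than the port math allows.

```python
# Port and capacity math for the dual-Nexus 6004 + 2248PQ-10G design.
SPINES = 2                    # pair of Nexus 6004 switches
SPINE_PORTS_40GE = 96
FEX_EDGE_PORTS_10GE = 48
FEX_UPLINKS_PER_SPINE = 2     # two 40GE uplinks to each 6004 (1:3 oversubscription)

fex_count = SPINE_PORTS_40GE // FEX_UPLINKS_PER_SPINE   # 48 fabric extenders
edge_ports = fex_count * FEX_EDGE_PORTS_10GE            # 48 * 48 = 2304

# All traffic is switched in the spines, so fabric capacity is bounded by the
# two 6004s (counting both directions of every 40GE port).
capacity_tbps = SPINES * SPINE_PORTS_40GE * 40 * 2 / 1000

print(f"fabric extenders: {fex_count}, total 10GE edge ports: {edge_ports}")
print(f"aggregate switching capacity: ~{capacity_tbps:.1f} Tbps")
```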
More details
Clos fabric architectures are described in (no surprise there) the Clos Fabrics Explained webinar; you might also want to watch the Data Center 3.0 for networking engineers webinar to learn more about other data center technologies. Finally, you might want to register for the May Data Center Fabrics update session (which will definitely include the Nexus 6000 switches).
Is this not the case on any equipment or just the 6000?
Also, s/strands/pairs/g
What are your thoughts on what an acceptable oversubscription ratio is? OK, I know that's a loaded question and I'm half expecting a 'how long is a piece of string' answer. Is it as 'simple' as: more east/west = lower ratio required, more north/south = higher ratio acceptable?
http://blog.ioshints.info/2013/02/the-saga-of-oversubscriptions.html
;-)
By the way, I think I got my points the wrong way around above:
Is it as 'simple' as: more east/west = lower ratio required, more north/south = higher ratio acceptable?
That should be:
Is it as 'simple' as: more east/west = higher ratio acceptable, more north/south = lower ratio required?
Let me drink this coffee and determine if that IS what I actually mean!
Hello Ivan, and thanks for your very interesting post. Cisco tells me that the number of FEXes will still be limited on the 6004. As a reminder, on the Nexus 5K it's limited to 24 FEXes; on the 6004 they're talking about 32, but that isn't confirmed. Are you sure about the maximum number of FEXes (48) that can be connected to the Nexus 6004?
Actually, the number of ports given in a Cisco Live presentation indicates 24 or 32 might be the right answer.
Thanks for this quick feedback.
May I ask a question about a single 40G link vs. a 4 x 10G LAG?
Let's suppose we have two data centers (DCs) in a campus, connected via two independent optical fibers. Each DC has its own Nexus 6001. Suppose we want to connect them via those two fibers and don't want to spend more than a single 40G port on each Nexus. (Of course we want to make this connection highly available - otherwise, why bother running two redundant fibers between the sites?)
The easy way to accomplish this is obviously the following: configure the 40G port as 4 x 10G ports, combine them into a LAG, and use fiber 1 for two 10G links of the LAG and fiber 2 for the two remaining links.
Now the interesting point: performance-wise, a single 40G link is said to be better than the 4 x 10G mode. Does it have any kind of internal redundancy? Can we split it with a breakout cable and route it via two different fibers to a matching 40G port on the remote 6001? If so, will the link keep working if one of the two fibers gets broken?