The following design challenge landed in my Inbox not too long ago:
My organization is in the process of building a completely new data center from the ground up (new hardware, software, protocols ...). We will currently start with one site but may move to two for DR purposes. What DC technologies should we be looking at implementing to build a stable infrastructure that will scale and support technologies you feel will play a big role in the future?
Application and server recommendations
Whatever you do, make sure you use scale-out application architecture as much as possible. Use products and tools that allow you to scale out every application tier (web servers, application servers and database servers). Web servers are usually easy to scale out unless you insist on weird session management techniques. Database servers are the toughest nut to crack, but even Microsoft’s SQL server has a somewhat redundant architecture.
If you want to make scale-out architecture transparent to the clients, you have to deploy load balancing. Use local load balancing within a data center and DNS-based load balancing between data centers (you might also try out anycast). Select products that have tight integration between local and DNS-based load balancing. Prefer vendors that integrate tightly with your server virtualization platform (example: new VMs should be added to load balancing pools automatically).
Use as much server virtualization as possible. Unless you have huge workloads where a single application needs several high-end physical servers for every tier, virtualization will significantly lower your costs and help you deploy new servers and applications faster.
Use high-end physical servers with as much memory and as many CPU cores as your budget can survive. Bob Plankers wrote a nice blog post explaining why scale-up makes sense for hypervisor hosts.
Support IPv4 and IPv6. Ideally you’d deploy only IPv6 on the inside networks assuming your applications can work over IPv6 (dual-stack deployment increases complexity and support costs) and do 4-to-6 and 6-to-6 load balancing. Some vendors might lack IPv6 support in their data center gear. It’s their problem, don’t make it yours.
Simplify your external routing and try to get a different public prefix for each site. Getting two public IPv4 prefixes might be a tough call; supposedly there’s plenty of IPv6 address space left judging by how we’re throwing it away.
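A minimal sketch of the per-site addressing idea, using Python's standard ipaddress module. The prefixes come from the IPv6 documentation range (2001:db8::/32) and the site names are hypothetical; substitute whatever your registry or ISP actually assigns.

```python
import ipaddress
from itertools import islice

# Hypothetical per-site prefixes taken from the documentation range.
SITE_PREFIXES = {
    "site-a": ipaddress.ip_network("2001:db8:a::/48"),
    "site-b": ipaddress.ip_network("2001:db8:b::/48"),
}


def server_subnets(prefix, count):
    """Return the first `count` /64 subnets carved out of a site prefix."""
    return list(islice(prefix.subnets(new_prefix=64), count))


for site, prefix in SITE_PREFIXES.items():
    subnets = ", ".join(str(s) for s in server_subnets(prefix, 2))
    print(f"{site}: {subnets}")
```

With a distinct prefix per site, each data center can be advertised independently and every server subnet is a plain /64 routed at layer 3.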
Keep your layer-2 domains small and use layer-3 switching (known inside the ivory towers as routing) as much as possible. Data centers should be islands of layer-2 connectivity in an ocean of layer-3 (thank you, @networkingnerd). Even with emerging earth-flattening technologies like FabricPath, TRILL or SPB, bridging still doesn’t scale (spanning tree protocol is not the only problem bridging has).
If you’ve listened to the previous advice, you don’t need large-scale bridging anyway – scale-out application architectures with load balancers work happily across multiple IP subnets (so do some recent clustering solutions) and don’t need large VLANs.
Furthermore, with proper application architecture and decent load balancing products, there’s no need to move virtual machines around. You can easily shut them down in one location and start them in another where they would get a different IP address; the load balancing tools (integrated with your virtualization platform) should automatically adapt to the changes you’ve made.
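To illustrate the shut-down-here, start-there idea, here is a toy sketch of a load balancing pool whose membership follows virtual machine lifecycle events. The class, the event names and the addresses are all hypothetical; a real load balancer and virtualization platform would expose their own APIs for this.

```python
class LoadBalancerPool:
    """Toy round-robin pool whose membership follows VM lifecycle events."""

    def __init__(self):
        self.members = []
        self._next = 0

    def on_vm_event(self, event, address):
        # Hypothetical event names; a real platform defines its own.
        if event == "started" and address not in self.members:
            self.members.append(address)
        elif event == "stopped" and address in self.members:
            self.members.remove(address)

    def pick(self):
        """Return the next server address in round-robin order."""
        if not self.members:
            raise LookupError("no servers in the pool")
        member = self.members[self._next % len(self.members)]
        self._next += 1
        return member


pool = LoadBalancerPool()
pool.on_vm_event("started", "2001:db8:a:1::10")  # VM running in the first site
pool.on_vm_event("stopped", "2001:db8:a:1::10")  # shut down there ...
pool.on_vm_event("started", "2001:db8:b:1::10")  # ... restarted elsewhere, new address
print(pool.members)
```

The clients keep talking to the same virtual address; only the pool behind it changes, which is why the VM doesn't have to keep its IP address (or its VLAN) when it moves.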
Some vendors might not have L3 switching available in products that would fit your needs. Remember that it’s their problem, not yours. Look around, there are alternatives.
Make your layer-2 domains stable. Use multi-chassis link aggregation to increase bandwidth utilization and reduce the impact of link failures. Use spanning tree protection features offered by your gear (BPDU guard, root guard, bridge assurance ...).
Use 10-gigabit Ethernet. It’s easier to maintain than tons of 1GbE links and might actually get decent utilization when used on the high-end servers mentioned earlier. I would try to use a system that supports virtual Ethernet NICs (Cisco UCS comes to mind). VMware is still unhappy if you don’t have plenty of NICs in your server; the easiest way to keep it happy is to use plenty of virtual NICs spread over two physical uplinks.
Data centers used to have a hierarchy of bandwidths – 100 Mbps to the servers, 1Gb in the access layer, 10Gb in the core. With 10GbE server attachments and 40/100GbE products still not widely available (or being too expensive), you’re forced to use high oversubscription ratios. Port channels help, but they’re not perfect. Select the gear that supports DCB standards to cope with high-volume servers overloading the core links. PFC and ETS are mandatory; QCN is not needed if you don’t have large L2 domains.
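The oversubscription arithmetic is easy to check for your own design. The sketch below assumes a hypothetical access switch with 48 server-facing 10GbE ports; the port counts are illustrative only and don't describe any particular product.

```python
from fractions import Fraction


def oversubscription(server_ports, server_gbps, uplinks, uplink_gbps):
    """Ratio of server-facing bandwidth to uplink bandwidth."""
    return Fraction(server_ports * server_gbps, uplinks * uplink_gbps)


# Hypothetical access switch: 48 x 10GbE server ports.
with_10g_uplinks = oversubscription(48, 10, uplinks=4, uplink_gbps=10)
with_40g_uplinks = oversubscription(48, 10, uplinks=4, uplink_gbps=40)

print(f"4 x 10GbE uplinks: {with_10g_uplinks}:1")
print(f"4 x 40GbE uplinks: {with_40g_uplinks}:1")
```

In this example, staying with 10GbE uplinks gives a 12:1 ratio, while 40GbE uplinks would bring it down to 3:1. That gap is what DCB features are meant to help you live with.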
Don’t even think about L2 data center interconnect. Yet again, if you did follow my advice and implemented load balancing and scale-out architecture, you don’t need L2 DCI. IP has been proven to work just fine between sites and there’s no reason you should try to reinvent the wheel that was demonstrated to be broken 20 years ago.
Build your Data Center Interconnects (DCI) with MPLS. Deploying MPLS does require a new set of skills, so it might go against the keep-it-simple recommendation, but it gives you flexibility. You can deploy IP routing, layer-3 VPNs (to keep security zones separated across the DCI link) or layer-2 VPNs (either VPLS or upcoming MAC VPN from Juniper) across the MPLS infrastructure as needed.
Security, Logging, Monitoring
Consider monitoring and security in the design and build phases. If feasible, use separate cabling for out-of-band management and monitoring, including console access (use terminal servers for remote console access). High-end devices have dedicated management ports; use them! At the very minimum, dedicate a VLAN and an L3 subnet exclusively for network management purposes.
Don't forget to consider physical security and system security. Deployment of IP cameras and recording equipment is worthwhile.
Log everything (and use NTP to synchronize clocks). Make sure you consider a Logging & Compliance Management system that collects logs from everything – including Windows, Unix, Storage, Firewalls and Mainframe – and then analyze them. Learn how to use a Security Event Manager to collate these logs.
Use as much stateless firewalling as possible. Having a stateful firewall that does nothing else but permit TCP sessions to port 80 (HTTP) is a waste.
Your firewalling needs also depend on your applications. If you use applications with a clean HTTP-based architecture, you’ll do just fine with packet filters. If the application uses RPC with dynamic server port numbers, you’re in trouble.
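A minimal sketch of what a stateless packet filter does, assuming a simplified packet model that carries only the protocol and destination port (real filters also match on addresses, interfaces and TCP flags). The rule set is hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Packet:
    proto: str
    dst_port: int


@dataclass(frozen=True)
class Rule:
    action: str
    proto: str
    dst_port: Optional[int] = None  # None matches any destination port

    def matches(self, pkt):
        return self.proto == pkt.proto and self.dst_port in (None, pkt.dst_port)


# Hypothetical inbound policy for a clean HTTP-based application.
RULES = [Rule("permit", "tcp", 80), Rule("permit", "tcp", 443)]


def filter_packet(pkt, rules=RULES):
    """Evaluate each packet on its own; no session state is kept."""
    for rule in rules:
        if rule.matches(pkt):
            return rule.action
    return "deny"  # implicit deny at the end of the rule list


print(filter_packet(Packet("tcp", 80)))     # well-known HTTP port
print(filter_packet(Packet("tcp", 49152)))  # dynamically allocated server port
```

A fixed rule list works only as long as the server ports are known in advance. An application that picks its server port dynamically has no rule to match, so you're left with opening a wide port range or adding a device that tracks the negotiation.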
If you must use firewalls, use products that support multiple logical (virtual) firewalls in a single chassis. As your data center changes, you will be able to create new logical firewall instances instead of buying new hardware devices.
Use Web Application Firewalls. Traditional firewalls and IDS/IPS devices cannot protect you against the majority of today’s threats – application-level intrusions like SQL injections. If you want to protect your web applications, you need a device that can reassemble HTTP requests, do a deep inspection and reject anything that looks suspicious.
Use IP everywhere. Unless you have legacy Fibre Channel gear or huge storage requirements where the FC management tools would make sense, go with iSCSI or NFS. Choose storage devices that support both. Use a separate VLAN for storage traffic; you might want to build a dedicated network to handle it. Use a dedicated 802.1p priority class for iSCSI/NFS traffic and PFC for lossless transport (lossless transport significantly improves performance of high-volume TCP traffic like iSCSI or NFS).
Use Small Storage Arrays (by Greg Ferro). The storage industry has a lot of new technology coming and storage technology is finally changing more quickly. I would advise buying the smallest sized storage arrays you feel will work and then plan for regular hardware updates or new systems. The shift from 3.5" FCAL drives to SAS 2.5" drives at 7200 RPM means less power and better performance, the impact of SSD drives is only just being delivered in new storage products, and software developments for deduplication and block recovery are moving from the high end into standard products.
Physical infrastructure (by Greg Ferro)
Install only the cabling you need. The transition from 1GbE to 10GbE, 40GbE and 100GbE means big changes for cabling. 10GbE over multimode needs one OM3 fiber pair, 40GbE needs eight cores, 100GbE needs twenty cores. The MPO connector can only be assembled in the factory. Therefore, running twenty- or hundred-core cabling in advance is a waste of time.
Discussion continues over the use of OM4 or even OM5 multimode for future Ethernet standards, but most likely single mode will be more common. All this uncertainty means you should install only the minimum cabling you really need and plan to use modular cabling systems into the future.
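To size a cable run for what you need today, you can add up the cores using the per-link figures quoted above. The link counts in this sketch are hypothetical.

```python
# Multimode cores per link, as quoted in the text above.
CORES_PER_LINK = {"10GbE": 2, "40GbE": 8, "100GbE": 20}


def cores_needed(links):
    """Total fiber cores for a mapping of link speed -> number of links."""
    return sum(CORES_PER_LINK[speed] * count for speed, count in links.items())


# Hypothetical run between two racks: four 10GbE links and two 40GbE links.
print(cores_needed({"10GbE": 4, "40GbE": 2}))
```

Rerunning the numbers when the requirements change is cheap; pulling out pre-installed cabling that no longer fits the next Ethernet standard is not.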
Go Bare Floor. The days of raised flooring are over. The weight of a server rack with four chassis or two large switches installed usually means reinforced flooring, which wastes time and space. Investigate cooling designs (Yahoo, Facebook/OpenCompute) that allow for direct floor use, such as hot/cold aisles and air flow containment, and overhead trays for power and data cabling.
The final advice
Last, but definitely not least, whatever you do – keep it simple. Choose the technologies that your team can support (or make sure they get properly trained). You will want to take a vacation at least once in the next decade and the guy that gets the support call at 1AM has to be able to solve the problem on his own without waking you up.
This document has been reviewed and greatly improved by (in alphabetical order) Dan Hughes (@rovingengineer), Greg Ferro (@etherealmind), Jeremy Filliben, Kurt Bales (@networkjanitor), Matthew Norwood (@matthewnorwood) and Tom Hollingsworth (@networkingnerd):
- Kurt Bales suggested using IPv6 on the inside networks (a strategy also favored by Tore Anderson), MPLS on the DCI infrastructure and WAFs;
- Matthew Norwood suggested out-of-band monitoring & management;
- Greg Ferro made extensive remarks on cabling issues (I’ve added a few links to his excellent blog posts) and flooring, recommended virtual firewalls and small storage arrays, and mentioned logging requirements;
- Jeremy Filliben mentioned 10GbE oversubscription problems (causing me to include DCB as one of the recommendations);
- Tom Hollingsworth made the fantastic islands-in-the-ocean analogy;
- I had a great discussion with Dan Hughes regarding external routing. This topic deserves a blog post of its own.
Thank you all for very quick and thoughtful responses!
Even more information
You’ll find big-picture perspective as well as in-depth discussions of various data center technologies in my webinars: Data Center 3.0 for Networking Engineers (recording), Data Center Interconnects (recording) and VMware Networking Deep Dive (recording or live session). All three webinars are also available as part of the yearly subscription.