Do We Need QoS in the Data Center?

Whenever I get asked about QoS in the data center, my stock reply is “bandwidth is cheaper than QoS-induced complexity.” This is definitely true in most cases, and ideally the elephant problems should be solved higher up in the application stack, not with network-layer kludges, but are there situations where you actually need data center QoS?

Scaling Overlay Networks: Scale-Out Control Plane

A week or so ago I described why a properly implemented hypervisor-based overlay virtual networking data plane is not a scalability challenge; even though the performance might decrease slightly as the total number of forwarding entries grows, modern implementations easily saturate 10GE server uplinks.

Scalability of the central controller or orchestration system is a totally different can of worms. As I explained in the Scaling Overlay Networks webinar, the only approach that avoids a single failure domain and guarantees scalability is a scale-out control plane architecture.

Response: We Dislike People Quoting Gartner

John Herbert wrote a wonderful post explaining why he (and a lot of other people including myself) hates seeing Gartner quotes in vendor presentations. Let me elaborate a bit on this apparent anti-Gartner sentiment.

Lisa Caywood from Brocade is an exception – watch the video of her Network Field Day 9 presentation to see what a vendor Gartner slide should look like.

Let’s Get Rid of the Thick Yellow Cable

Whenever I write about the crazy things vendors are trying to sell us, and the kludges we have to live with, I keep wondering, “Is it just me, or is the whole industry really as ridiculous as it seems?” It’s so nice to see someone else coming to the same conclusions, like Mark Burgess (the author of CFEngine and the Promise Theory) did in a lengthy essay on whether SDN makes sense.

BGP Configuration Made Simple with Cumulus Linux

BGP is without doubt the most scalable routing protocol, which made it a popular choice for large-scale deployments from service provider networks to enterprise WAN/VPN networks and even data centers. Its only significant drawback is the tedious configuration process (which almost reminds me of writing COBOL programs decades ago).

RFC 7454: BGP Operations and Security

After almost exactly three years of struggles our BGP Operations and Security draft became RFC 7454 – a cluebat (as Gert Doering put it) you can use on your customers and peers to help them fix their BGP setup.

Without Jerome Durand this document would probably remain forever stuck in the draft phase. It’s amazing how many hurdles one has to jump over to get something published within IETF. Thanks a million Jerome, you did a fantastic job!

Performance of Hypervisor-Based Overlay Virtual Networking

Years ago I managed to saturate a 10GE uplink on a vSphere server I tested with a single Linux VM using less than one vCPU. On the other hand, squeezing 1 Gbps out of Open vSwitch using GRE encapsulation was called ludicrous speed not so long ago. Implementing overlay virtual networking in the hypervisor obviously carries a huge performance penalty, right? Not so fast…

Update: Performance of Hash Table Lookups

In the Myths That Refuse to Die: Scalability of Overlay Virtual Networking blog post I wrote “number of MAC addresses has absolutely no impact on the forwarding performance until the MAC hash table overflows”, which happens to be almost true.
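The intuition behind the quoted claim can be sketched in a few lines of Python. This is a toy model, not the forwarding code of any hypervisor switch: it assumes a chained hash table, random 48-bit MAC addresses, and a hypothetical helper named `average_probes`. It shows that the average number of probes per lookup follows the load factor (entries per bucket), not the absolute number of entries, and that it climbs once a fixed-size table is overfilled.

```python
import random


def average_probes(num_macs: int, num_buckets: int, seed: int = 0) -> float:
    """Average probes for a successful lookup in a chained hash table.

    Assumption: MAC addresses are random 48-bit values and the bucket index
    is simply the address modulo the bucket count.
    """
    rng = random.Random(seed)
    chain_lengths = [0] * num_buckets
    for _ in range(num_macs):
        mac = rng.getrandbits(48)
        chain_lengths[mac % num_buckets] += 1
    # An entry at position k in its chain costs k probes to find, so a chain
    # of length n contributes 1 + 2 + ... + n probes in total.
    total_probes = sum(n * (n + 1) // 2 for n in chain_lengths)
    return total_probes / num_macs


# Same load factor (0.5), very different table sizes: similar lookup cost.
small = average_probes(1_000, 2_000)
large = average_probes(100_000, 200_000)

# Fixed-size table filled far beyond its design point: chains get long.
overloaded = average_probes(16_000, 2_000)

print(f"1,000 MACs, load 0.5:   {small:.2f} probes")
print(f"100,000 MACs, load 0.5: {large:.2f} probes")
print(f"16,000 MACs, load 8.0:  {overloaded:.2f} probes")
```

The model counts probes only. It ignores effects such as CPU caches, which a probe count cannot capture.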

Hands-On Tail-F Experience – Part 2

Want to know even more about Tail-F NCS after listening to Episode 22 of Software Gone Wild? Boštjan Šuštar and Marko Tišler from NIL Data Communications continue their deep dive into the secrets of NCS in Software Gone Wild Episode 23.

Myths That Refuse to Die: Scalability of Overlay Virtual Networking

If you watched the Network Field Day videos, you might have noticed an interesting (somewhat one-sided) argument I had with Sunay Tripathi, CTO and co-founder of Pluribus Networks (start watching at around 32:00 to get the context). Let’s try to set the record straight.

Must bookmark: NSX Link-o-Rama

Brad Hedlund sent me a link to a fantastic list of NSX resources, from design and troubleshooting guides to videos and blog posts. A must-bookmark page if you're even remotely interested in VMware NSX.

Networking Field Day 9: Brief Recap

I’m sitting in the San Francisco airport with nothing better to do than write blog posts, so let’s see what we’ve seen and learned during Networking Field Day 9.

Most videos recorded during the week are already online. You’ll find links to them in the Presentation Calendar section.

Hands-On Tail-F Experience on Software Gone Wild

Tail-F NCS implements one of the most realistic approaches to service abstraction (the cornerstone of SDN – at least in my humble opinion) – an orchestration system that automates service provisioning on existing infrastructure.

Is the product really as good as everyone claims? How hard is it to use? How steep is the learning curve? Boštjan Šuštar and Marko Tišler from NIL Data Communications have months of hands-on experience and were willing to share it in Episode 22 of Software Gone Wild.

Per-Packet Load Balancing on WAN links

One of my readers got an interesting idea: he’s trying to make the most of his WAN links by doing per-packet load balancing between a 30 Mbps and a 50 Mbps link. Not exactly surprisingly, the results are not what he expected.
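The teaser doesn't say what went wrong, but two likely effects are easy to sketch. The Python model below is a simplification under stated assumptions: fixed 1500-byte packets, strict round-robin spraying, a back-to-back burst, and no propagation delay. The function names are hypothetical. With an equal share of packets per link, the slower link fills up first, so the usable aggregate is closer to 2 × 30 Mbps than to 30 + 50 Mbps. Packets also arrive out of order because the two links serialize them at different speeds.

```python
from typing import List, Tuple

PACKET_BITS = 1500 * 8  # assumed fixed packet size


def round_robin_capacity(link_mbps: List[float]) -> float:
    """Usable aggregate rate when every link carries an equal share of packets."""
    return len(link_mbps) * min(link_mbps)


def arrival_order(link_mbps: List[float], packets: int) -> List[int]:
    """Sequence numbers in arrival order for a burst sprayed round-robin."""
    arrivals: List[Tuple[float, int]] = []
    for seq in range(packets):
        link = seq % len(link_mbps)
        position = seq // len(link_mbps) + 1  # nth packet queued on that link
        serialization_us = PACKET_BITS / link_mbps[link]  # bits / (Mbit/s) = microseconds
        arrivals.append((position * serialization_us, seq))
    return [seq for _, seq in sorted(arrivals)]


def count_reordered(order: List[int]) -> int:
    """Count packets that arrive after a packet with a higher sequence number."""
    highest_seen = -1
    late = 0
    for seq in order:
        if seq < highest_seen:
            late += 1
        else:
            highest_seen = seq
    return late


links = [30.0, 50.0]
capacity = round_robin_capacity(links)
order = arrival_order(links, 10)
reordered = count_reordered(order)

print(f"Sum of link speeds:   {sum(links):.0f} Mbps")
print(f"Round-robin capacity: {capacity:.0f} Mbps")
print(f"Arrival order:        {order}")
print(f"Out-of-order packets: {reordered}")
```

Real traffic is rarely a single burst, and TCP reacts to reordering in its own ways, so treat the numbers as an illustration of the mechanism rather than a prediction.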

ONS Accelerate Workshop: Amazingly Refreshing

Sometimes the stars do align: Open Networking Summit organized their Service Provider Accelerate Workshop just a day prior to Network Field Day, so I had the fantastic opportunity to attend both.

I didn’t know what to expect from an event full of SDN/NFV thought leaders, and was extremely pleasantly surprised by the amount of realistic down-to-earth information I got.

Whitebox Switching and Industry Pundits

Industry press, networking blogs, vendor marketing whitepapers and analyst reports are full of grandiose claims of benefits of whitebox switching and hardware disaggregation. Do you ever wonder whether these people actually practice their theories?

Scaling Overlay Networks: Distributed Data Plane

“Thou Shalt Have No Chokepoints” is one of those simple scalability rules that are pretty hard to implement in real-life products. In the Distributed Data Plane part of the Scaling Overlay Networks webinar I listed data plane components that can be easily distributed (layer-2 and layer-3 switching), some that are harder to implement but still doable (firewalling) and a few that are close to mission-impossible (NAT and load balancing).

Let’s Meet in Zurich or Heidelberg

I’ll be speaking at two conferences in March: an SDN event in Zurich organized by the fantastic Gabi Gerber, and the best boutique security conference – Troopers 15 in Heidelberg. If you’ll be attending one of these events, just grab me, drag me to the nearest coffee table, and throw some interesting questions my way ;) … and if you happen to be near one of these locations, let me know and we might figure out how to meet somewhere.

Whiteboarding Cisco ACI on Software Gone Wild

Late last year David Gee and I wanted to test another interesting gizmo: an online virtual whiteboard. David was pondering some interesting aspects of Cisco ACI, and they seemed like a perfect topic for an impromptu discussion.

We Need to Move from Assembling Car Parts to Driving Cars

During a great conversation I had with Terry Slattery during Interop New York, he said “well, I don’t think anyone should be configuring VLANs and asking ‘How to configure a VLAN on a switch’ – we should be focused on providing end-to-end connectivity”, and there’s absolutely nothing in that statement that one could disagree with.

Combining MPLS/VPN, MPLS-TE and QoS on MPLS Talks

In the final part of our MPLS-focused discussion, Seamus wanted to know how one could combine MPLS/VPN, MPLS-TE and QoS (for example, sending VoIP traffic for one customer over a different path).

Short answer: don’t even think about doing that. The added complexity is not worth whatever extra money you’ll be charging the customer (or not).

Before Talking about vMotion across Continents, Read This

I expect to hear a lot about the “wonderful” idea of moving running VMs 100 msec away (across the continent) in the upcoming weeks. I would recommend you read a few of my older blog posts before considering it… and don’t waste time trying to persuade the true believers with technical arguments – talk with whoever will foot the bill or walk away.

Big Cloud Fabric: Scaling OpenFlow Fabric

I’m still convinced that architectures with centralized control planes (and that includes solutions relying on OpenFlow controllers) cannot scale. On the other hand, Big Switch Networks is shipping Big Cloud Fabric, and they claim they solved the problem. Obviously I wanted to figure out what’s going on, and Andy Shaw and Rob Sherwood were kind enough to explain the interesting details of their solution.

Long story short: Big Switch Networks significantly extended OpenFlow.

Last Chapter of Data Center Design Case Studies Is Published

A few days ago I completed the last chapter in the Data Center Design Case Studies book: building disaster recovery and active-active data centers. It focuses on application behavior and business needs, not on the underlying technologies; the networking technology part tends to be way easier to solve than the oft-ignored application-level challenges.