NAPALM Update on Software Gone Wild

We did a podcast describing NAPALM, an open-source multi-vendor abstraction library, a while ago, and as the project made significant progress in the meantime, it was time for a short update.

NAPALM started as a library that abstracted the intricacies of network device configuration management. Initially it supported configuration replace and merge; in the meantime, they added support for diffs and rollbacks.
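The four operations mentioned above (replace, merge, diff, rollback) can be illustrated with a minimal in-memory sketch. This is not NAPALM's API: the ToyDevice class and all of its method names are hypothetical, and the "device" is just a list of configuration lines.

```python
import difflib


class ToyDevice:
    """In-memory stand-in for a network device (hypothetical, not NAPALM)."""

    def __init__(self, running):
        self.running = list(running)  # active configuration lines
        self.candidate = None         # staged configuration, if any
        self.previous = None          # config before the last commit

    def load_replace(self, lines):
        # Replace: the candidate becomes exactly the supplied configuration.
        self.candidate = list(lines)

    def load_merge(self, lines):
        # Merge: keep the running config and append lines it does not contain.
        base = list(self.running)
        self.candidate = base + [line for line in lines if line not in base]

    def diff(self):
        # Diff: unified diff between the running and candidate configuration.
        if self.candidate is None:
            return ""
        return "\n".join(difflib.unified_diff(
            self.running, self.candidate,
            fromfile="running", tofile="candidate", lineterm=""))

    def commit(self):
        # Commit: activate the candidate and remember the old config.
        if self.candidate is None:
            return
        self.previous, self.running, self.candidate = (
            self.running, self.candidate, None)

    def rollback(self):
        # Rollback: restore the configuration saved by the last commit.
        if self.previous is not None:
            self.running, self.previous = self.previous, None


device = ToyDevice(["hostname sw1", "ntp server 10.0.0.1"])
device.load_merge(["ntp server 10.0.0.2"])
print(device.diff())   # shows the added NTP server line
device.commit()
device.rollback()      # running config is back to the original two lines
```

The point of the sketch is the workflow: stage a candidate, inspect the diff before committing, and keep enough state to undo the change.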

To API or Not To API

One of my readers left this comment (slightly rephrased) on my Network Automation RFP Requirements blog post:

Given that we look up to our *nix pioneers as standard bearers for system automation, why do we demand an API from network devices? The API requirement would make sense if the vendor OS is a closed system. If an open system vendor creates APIs for applications running on their system (say for BGP configs) - kudos to them, but I no longer think that should be mandated.

He’s right - an API is not a mandatory prerequisite for reliable network automation.

Do Enterprises Need MPLS?

Continuing the Do Enterprises Need VRFs discussion, let’s see which enterprise networks might need MPLS.

Do you need VRFs?

Read the previous blog post. If the answer is NO, you can stop reading. Otherwise, carry on.

New Webinar: Networks, Buffers and Drops

Do you need large buffers in data center switches or not? If you’re a vendor your take obviously depends on whether you have them or not, and then there are people saying “it’s bullshit” (mostly agree) and “look, I have a shinier toy” (get lost).

Unfortunately, it’s really hard to get someone who would know what he’s talking about, and be relatively unbiased.

The Network Is Reliable and Other Stories

I was cleaning my Blog Post Ideas Evernote notebook and found these gems hidden deep inside its bowels:

I still haven’t found the presentation in which someone (from Facebook?) explained how long DNS information with long-expired TTLs persists in the clients. Relevant links would be highly appreciated.

Why cybersecurity certifications suck

Robert Graham wrote a great blog post explaining why so many IT certifications suck.

TL&DR: because they are trivial pursuits instead of knowledge assessment tests… but do read the whole post and compare it to your recent certification experience.

Basic Docker Networking

After explaining the basics of Linux containers, Dinesh Dutt moved on to the basics of Docker networking, starting with an in-depth explanation of how a container communicates with other containers on the same host, with containers residing on other hosts, and the outside world.

Do You Use SSL between Load Balancers and Servers?

One of my readers sent me this question:

Using SSL over the Internet is a must when dealing with sensitive data. What about SSL between data center components (frontend load-balancers and backend web servers for example)? Does it make sense to you? Can the question be summarized as "do I trust my Datacenter network team"? Or is there more at stake?

In the ideal world in which you’d have a totally reliable transport infrastructure the answer would be “There’s no need for SSL across that infrastructure”.

Do Enterprises Need VRFs?

One of my readers sent me a long list of questions titled “Do enterprise customers REALLY need VRFs?”

The only answer I could give is “it depends” (it’s like asking “Do animals need wings?”), and here’s my attempt at building a decision tree:

You can use the decision tree to figure out whether you need VRFs in your data center or in your enterprise WAN.

Save the date: Leaf-and-Spine Fabric Design Workshop in Zurich

Do you believe in vendor-supplied black box (regardless of whether you call it ACI or SDDC) or in building your own data center fabric using solid design principles?

It should be an easy choice if you believe a business should control its own destiny instead of being pulled around by vendor marketing (to paraphrase Russ White).

One of the better explanations of SDN

Stumbled upon this via HighScalability:

Every time I feel like I'm "out of touch" with the hip new thing, I take a weekend to look into it. I tend to discover that the core principles are the same [...]; or you can tell they didn't learn from the previous solution and this new one misses the mark, but it'll be three years before anyone notices (because those with experience probably aren't touching it yet, and those without experience will discover the shortcomings in time.)

Yep, that explains the whole centralized control plane ruckus ;) Read also a similar musing by Ethan Banks.

Fast Linux Packet Forwarding with Thomas Graf on Software Gone Wild

We did several podcasts describing how one could get stellar packet forwarding performance on x86 servers by reimplementing the whole forwarding stack outside of the kernel (Snabb Switch) or bypassing the Linux kernel and moving the packet processing into userspace (PF_Ring).

Now let’s see if it’s possible to improve the Linux kernel forwarding performance. Thomas Graf, one of the authors of Cilium, claims it can be done and explained the intricate details in Episode 64 of Software Gone Wild.

Network Automation RFP Requirements

After finishing the network automation part of a recent SDN workshop I told the attendees “Vote with your wallet. If your current vendor doesn’t support the network automation functionality you need, move on.”

Not surprisingly, the next question was “And what shall we ask for?” Here’s a short list of ideas, please add yours in comments.

Do I Need Redundant Firewalls?

One of my readers sent me this question:

I often see designs involving several more than 2 DCs spread over different locations. I was actually wondering if that makes sense to bring high availability inside the DC while there's redundancy in place between the DCs. For example, is there a good reason to put a cluster of firewalls in a DC, when it is possible to quickly fail over to another available DC, as a redundant cluster increases costs, licenses and complexity.

Rule#1 of good engineering: Know Your Problem ;) In this particular case:

Check Out the Designing Active-Active and Disaster Recovery Data Centers Webinar

The featured webinar in October 2016 is the Designing Active-Active and Disaster Recovery Data Centers webinar, and the featured videos include the discussion of disaster avoidance challenges and the caveats you might encounter with long-distance vMotion. All subscribers can view these videos; if you’re not one of them yet, start with the trial subscription.

As a trial subscriber you can also use this month's featured webinar discount to purchase the webinar.

The Impact of ICMP Redirects

One of my readers sent me an interesting question after reading my ICMP Redirects blog post:

In Cisco IOS, when a packet is marked by IOS for ICMP redirect to a better gateway, that packet is being punted to the CPU, right?

It depends on the platform, but it’s going to hurt no matter what.

Optimize Your Data Center: Virtual Appliances

We got pretty far in our Data Center optimization journey. We virtualized the workload, got rid of legacy technologies, reduced the number of server uplinks, and replaced storage arrays with a distributed file system.

Final step on the journey: replace physical firewalls and load balancers with virtual appliances.

Using DNS Names in Firewall Rulesets

My friend Matthias Luft sent me an interesting tweet a while ago:

All I could say in 160 characters was “it depends”. Here’s a longer answer.
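To make the question concrete, here is a minimal sketch of one way a ruleset compiler could treat a DNS name: resolve it once and expand the rule into per-address rules. The function names, the rule tuple format and the example hostname are all assumptions made for illustration. Note that the result is a point-in-time snapshot, so it goes stale whenever the DNS record changes.

```python
import socket


def resolve(name):
    """Return the sorted IP addresses the system resolver gives for name."""
    infos = socket.getaddrinfo(name, None)
    return sorted({info[4][0] for info in infos})


def expand_rule(action, name, port, resolver=resolve):
    """Expand one name-based rule into per-address rules (a snapshot)."""
    return [(action, address, port) for address in resolver(name)]


# A fake resolver keeps the example independent of live DNS
# (hypothetical name, documentation-range addresses).
fake_dns = {"app.example.com": ["192.0.2.10", "192.0.2.11"]}
rules = expand_rule("permit", "app.example.com", 443,
                    resolver=fake_dns.__getitem__)
print(rules)
# [('permit', '192.0.2.10', 443), ('permit', '192.0.2.11', 443)]
```

Passing the resolver as a parameter also makes the trade-off visible: whoever controls the resolver's answers effectively controls what the firewall permits.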

Worth Reading on Network Guru

Just wanted to point you to two excellent blog posts recently published by Russ White.

Reaction: DevOps and Dumpster Fires

If teaching coders isn’t going to solve the problem, then what do we do? We need to go to where the money is. Applications aren’t bought by coders, just like networks aren’t.

Ansible versus Puppet in Initial Device Provisioning

One of the attendees of my Building Next-Generation Data Center course asked this interesting question after listening to my description of differences between Chef/Puppet and Ansible:

For Zero-Touch Provisioning to work, an agent gets installed on the box as a boot up process that would contact the master indicating the box is up and install necessary configuration. How does this work with agent-less approach such as Ansible?

Here’s the first glitch: many network devices don’t ship with a Puppet or Chef agent; you have to install it during the provisioning process.

Use VRFs to Solve Routing-on-Hosts Challenges

One of my readers sent me interesting feedback after reading my explanation of why I’d try not to use OSPF as a routing protocol between hosts and ToR switches. He said:

Unfortunately we can’t use BGP because IBM mainframes support only OSPF or RIP, so we decided to use VRFs instead.

Here’s what they did:

Survey on IXP Routing and Privacy

Marco Canini from UC Louvain is working on an IXP research project focused on bringing privacy guarantees into the Internet routing context. They’re trying to understand the privacy considerations of network operators and have created a short survey to gather the initial data.

Researchers from UC Louvain have been involved in tons of really useful projects including BGP PIC, LFA, MP-TCP, Fibbing, Software-defined IXP and flow-based load balancing, so if you’re connected to an IXP, please take your time and fill in the survey.