How I Started Hating Automatic Context Switching in Cisco IOS

Here’s a trick question:

To implement this request you use the following configuration commands (plenty of other commands removed because they don’t impact the results):

router bgp 64500
 address-family ipv4
  maximum-paths ibgp 32
  maximum-paths 32
  neighbor next-hop-self
  neighbor next-hop-self
 address-family vpnv4
  maximum-paths ibgp 32
  maximum-paths 32
  no neighbor next-hop-self
  no neighbor next-hop-self

Try to figure out what the end-result will be without connecting to a router or reading the rest of this blog post.

Ok, here’s what totally threw me off (and wasted an hour of my life): next-hop-self is removed from neighbors in the IPv4 address family. Here’s why:

  • There is no maximum-paths ibgp command in VPNv4 address family;
  • The moment you enter maximum-paths ibgp command the configuration parser exits the address-family vpnv4 context and enters router bgp context;
  • Because the ipv4 address family is the default context within router bgp (for legacy reasons), all the subsequent commands are executed within the address-family ipv4 context, removing next-hop-self from neighbors in the IPv4 address family.
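
The sequence in the bullets above can be replayed with a toy model. This is only a sketch: it is not how the IOS parser is implemented, the single rejection rule is taken from the description above, and the neighbor address 192.0.2.1 is a hypothetical placeholder.

```python
# Toy replay of the context fallback described above.
# Assumption: the only rejected command is "maximum-paths ibgp" inside
# address-family vpnv4; everything else is accepted everywhere.

def accepts(context, command):
    """Return True if the (hypothetical) context accepts the command."""
    if context == "address-family vpnv4" and command.startswith("maximum-paths ibgp"):
        return False
    return True


def replay(lines):
    """Return (effective_context, command) pairs for a list of config lines."""
    context = "router bgp"
    applied = []
    for raw in lines:
        cmd = raw.strip()
        if cmd.startswith("router bgp"):
            context = "router bgp"
            continue
        if cmd.startswith("address-family"):
            context = cmd
            continue
        if not accepts(context, cmd):
            # Automatic context switch: drop out of the address family.
            context = "router bgp"
        # Commands entered directly under router bgp land in the IPv4 family.
        effective = "address-family ipv4" if context == "router bgp" else context
        applied.append((effective, cmd))
    return applied


CONFIG = """
router bgp 64500
 address-family ipv4
  maximum-paths ibgp 32
  maximum-paths 32
  neighbor 192.0.2.1 next-hop-self
 address-family vpnv4
  maximum-paths ibgp 32
  maximum-paths 32
  no neighbor 192.0.2.1 next-hop-self
"""

result = replay(CONFIG.strip().splitlines())
for effective, cmd in result:
    print(f"{effective:22} {cmd}")
```

In this model every command typed after address-family vpnv4 ends up in the IPv4 address family, including the final no neighbor command.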

No wonder David Barroso named his library NAPALM (you’ll find the full story in this or this podcast).

Worth Investigating: My Looking Glass Tool

If you're a networking engineer, sysadmin, or NetDevOps guru preferring the power of CLI over carpal-tunnel-syndrome-inducing GUI you might like the My Looking Glass tool developed by Mehrdad Arshad Rad. Haven't tried it out, but the intro on the GitHub page looks promising.

If you decide to try it out (or already did) please share your experience in a comment. Thank you!

Push Configuration Snippet to a Bunch of Cisco IOS Devices

As I was trying to automate configuration deployment in a multi-router Cisco IOS lab, I got to a point where the only way of figuring out what was going on was to log commands on Cisco IOS devices. Not a big deal, but I hate logging into a dozen boxes and configuring the same few lines on all of them (or removing them afterwards).

Time for another playbook: this one can push one of many (configurable) configuration snippets to a group of Cisco IOS devices defined in an Ansible inventory file.

Interesting? Want to do something more complex? Join the Network Automation online course.

Generating OSPF, BGP and MPLS/VPN Configurations from Network Data Model

Over a month ago I decided to create a lab network to figure out how to solve an interesting Inter-AS MPLS/VPN routing challenge. Instead of configuring half a dozen routers I decided to develop a fully-automated deployment because it will make my life easier.

I finally got to a point where OSPF, LDP, BGP (IPv4 and VPNv4) and MPLS/VPN configurations are created, deployed and verified automatically.

Create Ansible Inventory File from Vagrant SSH Configuration

While it’s relatively easy to create an Ansible inventory file to support a Vagrant-created virtual networking lab, it’s also utterly boring – a perfect job for an automation script. I’m positive there are a zillion solutions out there, but I decided to reinvent the wheel and get a bit of Python hands-on practice.
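
One way such a script could look is sketched below. It is not the script mentioned in this post: the function names, the group name and the sample hosts r1 and r2 are made up for illustration, and the input is assumed to be the text printed by vagrant ssh-config.

```python
# Sketch: turn `vagrant ssh-config` output into an INI-style Ansible inventory.
# Assumption: values contain no spaces (a path with spaces would need quoting).

# SSH option -> Ansible inventory variable
FIELD_MAP = [
    ("HostName", "ansible_host"),
    ("Port", "ansible_port"),
    ("User", "ansible_user"),
    ("IdentityFile", "ansible_ssh_private_key_file"),
]


def parse_ssh_config(text):
    """Return {host: {option: value}} from ssh-config formatted text."""
    hosts = {}
    current = None
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        key, value = parts
        if key == "Host":
            current = hosts.setdefault(value, {})
        elif current is not None:
            current[key] = value
    return hosts


def to_inventory(hosts, group="lab"):
    """Render the parsed hosts as one inventory group."""
    lines = [f"[{group}]"]
    for name, opts in hosts.items():
        settings = [f"{var}={opts[key]}" for key, var in FIELD_MAP if key in opts]
        lines.append(" ".join([name] + settings))
    return "\n".join(lines)


# Hypothetical sample in the format printed by `vagrant ssh-config`.
SAMPLE = """
Host r1
  HostName 127.0.0.1
  User vagrant
  Port 2222

Host r2
  HostName 127.0.0.1
  User vagrant
  Port 2200
"""

inventory = to_inventory(parse_ssh_config(SAMPLE))
print(inventory)
```

In a real lab the sample text would be replaced with the captured output of vagrant ssh-config, for example through subprocess.run.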

Network Automation Labs with Ansible in a Virtual Machine

Most network automation tutorials out there assume you’re running Ansible on your workstation and accessing virtual machines via SSH ports mapped by Vagrant. That’s great if you’re an experienced Ansible/Python user; for a clunky beginner like myself it’s safer to run Ansible within a VM that can be destroyed and recreated in seconds.

Video: SDN Controller Running in a Virtual Machine

During the Monitoring Software-Defined Networks webinar Terry Slattery addressed an interesting question: what happens if you run an SDN controller in a virtual machine and the virtual network crashes?

Want to know his take on the problem? Watch this free video on content site.

Q&A: Vendor OpenFlow Limitations

I rarely get OpenFlow questions these days; here’s one I got not so long ago:

I've just spent the last 2 days of my life consuming the ONF 1.3.3 white paper in addition to the $vendor SDN guide to try and reconcile what features it does or does not support and have come away disappointed...

You’re not the only one ;)

Worth Reading: Load Balancing at Fastly

High-speed scale-out load balancing is a Mission Impossible. You can get the correct abstraction at the wrong cost or another layer of indirection (to paraphrase the authors of Fastly load balancing solution).

However, once every third blue moon you might get a team of smart engineers focused on optimal solutions to real-life problems. The result: a layer of misdirection, a combination of hardware ECMP and server-level traffic redirection. Enjoy!

Network Automation Tools: Featured Webinar in December 2016

The featured webinar in December 2016 is the Network Automation Tools webinar, and in the featured videos you'll find an in-depth description of automation frameworks (focusing on Ansible) and open-source IPAM tools (including NSoT recently released by Dropbox).

To view the videos, log into, select the webinar from the first page, and watch the videos marked with star.

Snabb Switch with vMX Control Plane on Software Gone Wild

In Software Gone Wild Episode 52 Katerina Barone-Adesi explained how Igalia implemented 4-over-6 tunnel termination (lwAFTR) with Snabb Switch. Their solution focused on a very fast data plane and had no real control plane.

No problem – there are plenty of stable control planes on the market, all we need is some glue.

Q&A: Building a Layer-2 Data Center Fabric in 2016

One of my readers designing a new data center fabric that has to provide L2 transport across the data center sent me this observation:

While we don’t have plans to seek an open solution in our DC we are considering ACI or VXLAN with EVPN. Our systems integrator partner expressed a view that VXLAN is still very new. Would you share that view?

Assuming he wants to stay with Cisco, what are the other options?

Response: On the Death of OpenFlow

On November 7th SDx Central published an article saying “OpenFlow is virtually dead.” There’s a first time for everything, and it’s real fun reading a marketing blurb on a site sponsored by SDN vendors claiming the shiny SDN parade unicorn is dead.

On a more serious note, Tom Hollingsworth wrote a blog post in which he effectively said “OpenFlow is just a tool. Can we please find the right problem for it?”

Network Automation Online Course: a Vendor Perspective

A few days after I published the blog post describing why it might make sense to attend the Building Network Automation Solutions course even when you’re already using a $vendor network management system/platform, I got a surprising email from one of my friends working for a major networking vendor:

Building a L3-Only Data Center with Cumulus Linux

Dinesh Dutt was the guest speaker in the second Leaf-and-Spine Fabric Design session. After I explained how you can use ARP/ND information to build a layer-3-only data center fabric that still supports IP address mobility, Dinesh described the details of Cumulus Linux redistribute ARP functionality and demoed how it works in a live data center.

Finding Excuses to Avoid Network Automation

My Network Automation in Enterprise Environments blog post generated the expected responses, including:

Some of the environments I am looking at have around 2000-3000 devices and 6-7 vendors for various functions and 15-20 different device platform from those vendors. I am trying to understand what all environments can Ansible scale up to and what would be an ideal environment enterprises should be looking at more enterprise grade automation/orchestration platforms while keeping in mind that platform allows extensibility.

Luckily I didn’t have to write a response – one of the readers did an excellent job:

Worth Reading: So You Want to Become a Cloud Provider

My friend Robert Turnšek published an interesting blog post pondering whether it makes sense to become a cloud provider.

I loved reading it, particularly the Trap for System Integrators part, because I know a bit of the history, and could easily identify two or three failed or stalled projects per paragraph (like: “Just adding some blade servers and storage to the existing server environment won’t make you a cloud provider”). Hope you’ll have as much fun as I did.

Q&A: Ingress Traffic Flow in Multi-Data Center Deployments

One of my readers was watching the Building Active-Active Data Centers webinar and sent me this question:

I'm wondering if you have additional info on how to address the ingress traffic flow issue? The egress is well explained but the ingress issue wasn't as well explained.

There’s a reason for that: there’s no good answer.

StackStorm 101 on Software Gone Wild

A few weeks ago Matt Oswalt wrote an interesting blog post on principles of automation, and we quickly agreed it’s a nice starting point for a podcast episode.

In the meantime Matt moved to StackStorm team so that became the second focus of our chat… and then we figured out it would be great to bring in Matt Stone (the hero of Episode 13).

Testing Ansible Playbooks with Cisco VIRL

Cisco VIRL is the ideal testing environment when you want to test your Ansible playbooks with various Cisco network operating systems (IOS, IOS XE, NX-OS or IOS XR). The “only” gotcha: how do you reach those devices from the outside world?

It was always possible to reach the management interface of devices running with VIRL, and it got even simpler with VIRL release 1.2.

Q&A: Big Switch SDN

Got this set of questions from one of my readers:

I just met up with DELL guys for Big Switch SDN. They claim there is no routing running on leaf switches, the BCF is purely OpenFlow.

Almost true. It is based on OpenFlow, but they use tons of their own OpenFlow extensions to get stuff to work. That’s also why you have to install their agent on the switches.

First Speakers in Building Network Automation Solutions Online Course

Like with the Next-Generation Data Center course, the live sessions in the Building Network Automation Solutions course include guest speakers, Q&A discussions, and solutions to sample challenges that you’ll be able to use to complete your homework assignments.

The guest speakers for the January 2016 course include:

Video: Docker Networking Options

After introducing the fundamentals of Docker networking, Dinesh Dutt focused on various Docker networking options, including multi-host networking with overlays.

After watching the video, you might also want to listen to Episode 49 of Software Gone Wild with Brent Salisbury, Dave Tucker and Madhu Venugopal.

Can VMware NSX and Cisco ACI Interoperate over VXLAN?

I got a long list of VXLAN-related questions from one of my subscribers. It started with an easy one:

Does Cisco ACI use VXLAN inside the fabric or is something else used instead of VXLAN?

ACI uses VXLAN but not in a way that would be (AFAIK) interoperable with any non-Cisco product. While they do use some proprietary tagging bits, the real challenge is the control plane.

Reliability of Clustered Solutions: Another Data Point

A while ago I wrote:

I haven’t seen any hard data, but intuition suggests that apart from hardware failures a standalone firewall might be more stable than a state-sharing firewall cluster.

Guillaume Sachot (working for a web hosting company) sent me his first-hand experience on this topic:

Becoming a Programmer on Software Gone Wild

During our summer team-building podcast we agreed it would be fun to record a few episodes along the “how do I become a programmer” theme and figured out that Elisa Jasinska would be a perfect first candidate.

A few weeks ago we finally got together and started our chat with campfire stories remembering how we got started with networking and programming.

L3 Virtualization and VRFs

I got into an interesting discussion with Johannes Luther on the need for VRFs and he wrote:

If VRF = L3 virtualization technologies, then I saw that link. However, VRFs are again just a tiny piece of the whole story.

Of course he’s right, but it turns out that VRFs are the fundamental building block of most L3 virtualization technologies using a shared infrastructure.

Network Automation: Lego Bricks and Death Stars

One of the challenges traditional networking engineers face when starting their network automation journey is the “build or buy” decision: should I use a plethora of small open-source or commercial tools and components and build my own solution, or should I buy a humongous platform from a reassuringly-expensive $vendor?

Most of us were used to buying platforms ranging from CiscoWorks to HP OpenView (oops, Business Technology Optimization Software) or now Cisco’s NSO, so it’s natural that we’re trying to map this confusing new world into old patterns, leading to interesting discussions like the one I had during one of my workshops:

Could You Use IS-IS Instead of BGP for Routing on Hosts?

One of my readers sent me an interesting question a while ago:

Isn’t IS-IS a better fit for building L3-only networks than BGP, particularly considering that IS-IS already has a protocol to communicate with the end systems (ES-IS)?

In theory, he’s correct (see also this blog post).

Optimize Your Data Center: How Far Did We Get?

Our Data Center optimization journey has finished. We virtualized the workload, got rid of legacy technologies, reduced the number of server uplinks, replaced storage arrays with a distributed file system, and replaced physical firewalls and load balancers with virtual appliances.

Let’s see what’s left: it turns out you really don’t need more than two switches in most data centers.

Optimal Inter-AS Routing Challenge

I encountered an ancient problem during one of my ExpertExpress engagements:

  • Customer network is split into two autonomous systems (core and access);
  • Links within access network are way slower than links within core network;
  • Customer would like to have optimal core-to-access traffic flow.

Challenge: what’s the simplest possible configuration to get it done?

Breaking News: I’m a Vendor Shill

Got this comment on my Network Automation RFP Requirements blog post:

Looks like you are paid shill for Brocade based on the quote earlier in your blog "The Pass/Fail information included below was collected to the best of my knowledge with extensive help from Jason Edelman, Nick Buraglio, David Barroso and several Brocade engineers (THANK YOU!)."

Hooray, one more accolade to add to my list of accomplishments. And now for a few more details:

First Speakers in the Spring 2017 Data Center Course

It’s only two weeks since the last live session of the Autumn 2016 Data Center course in which Mitja Robas did a fantastic job describing a production deployment of VMware NSX on top of Cisco Nexus 9000 network, and we already have the first speakers for the Spring 2017 event:

  • Scott Lowe (now at VMware) will talk about the role of open source in data center infrastructure;
  • Thomas Wacker (UBS AG) will talk about their fully automated data center deployments;
  • Andrew Lerner and Simon Richard (Gartner) will participate in a panel discussion on data center and networking trends.

NAPALM Update on Software Gone Wild

We did a podcast describing NAPALM, an open-source multi-vendor abstraction library, a while ago, and as the project made significant progress in the meantime, it was time for a short update.

NAPALM started as a library that abstracted the intricacies of network device configuration management. Initially it supported configuration replace and merge; in the meantime, they added support for diffs and rollbacks.
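
The merge, diff and commit-or-discard cycle can be sketched with the method names NAPALM drivers expose (load_merge_candidate, compare_config, commit_config, discard_config). The helper merge_with_review, the approve callback and the FakeDevice stand-in are hypothetical and exist only so the sketch runs without a router.

```python
# Sketch of a merge-with-review cycle on a NAPALM-style driver object.
# With a real device you would typically start with something like:
#   from napalm import get_network_driver
#   device = get_network_driver("ios")("192.0.2.1", "user", "password")
#   device.open()
# (address and credentials above are placeholders).

def merge_with_review(device, snippet, approve):
    """Load a merge candidate, compute the diff, then commit or discard.

    approve is a callable that receives the diff text and returns True
    to commit. The diff is returned to the caller either way.
    """
    device.load_merge_candidate(config=snippet)
    diff = device.compare_config()
    if diff and approve(diff):
        device.commit_config()
    else:
        device.discard_config()
    return diff


class FakeDevice:
    """Stand-in that only records calls; it talks to no real device."""

    def __init__(self):
        self.calls = []
        self.candidate = ""

    def load_merge_candidate(self, filename=None, config=None):
        self.calls.append("load_merge_candidate")
        self.candidate = config or ""

    def compare_config(self):
        self.calls.append("compare_config")
        return "\n".join("+" + line for line in self.candidate.splitlines())

    def commit_config(self):
        self.calls.append("commit_config")

    def discard_config(self):
        self.calls.append("discard_config")


fake = FakeDevice()
reviewed = merge_with_review(fake, "logging buffered 8192", lambda diff: True)
print(reviewed)
print(fake.calls)
```

Passing a callback that returns False takes the discard branch instead, which leaves the running configuration untouched.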

To API or Not To API

One of my readers left this comment (slightly rephrased) on my Network Automation RFP Requirements blog post:

Given that we look up to our *nix pioneers as standard bearers for system automation, why do we demand an API from network devices? The API requirement would make sense if the vendor OS is a closed system. If an open system vendor creates APIs for applications running on their system (say for BGP configs) - kudos to them, but I no longer think that should be mandated.

He’s right - API is not a mandatory prerequisite for reliable network automation.

New Webinar: Networks, Buffers and Drops

Do you need large buffers in data center switches or not? If you’re a vendor your take obviously depends on whether you have them or not, and then there are people saying “it’s bullshit” (mostly agree) and “look, I have a shinier toy” (get lost).

Unfortunately, it’s really hard to get someone who would know what he’s talking about, and be relatively unbiased.

The Network Is Reliable and Other Stories

I was cleaning my Blog Post Ideas Evernote notebook and found these gems hidden deep inside its bowels:

I still haven’t found the presentation in which someone (from Facebook?) explained how long DNS information with long-expired TTLs persists in the clients. Relevant links would be highly appreciated.

Do You Use SSL between Load Balancers and Servers?

One of my readers sent me this question:

Using SSL over the Internet is a must when dealing with sensitive data. What about SSL between data center components (frontend load-balancers and backend web servers for example)? Does it make sense to you? Can the question be summarized as "do I trust my Datacenter network team"? Or is there more at stake?

In the ideal world in which you’d have a totally reliable transport infrastructure the answer would be “There’s no need for SSL across that infrastructure”.

Do Enterprises Need VRFs?

One of my readers sent me a long list of questions titled “Do enterprise customers REALLY need VRFs?”

The only answer I could give is “it depends” (it’s like asking “Do animals need wings?”), and here’s my attempt at building a decision tree:

You can use the decision tree to figure out whether you need VRFs in your data center or in your enterprise WAN.

Save the date: Leaf-and-Spine Fabric Design Workshop in Zurich

Do you believe in vendor-supplied black box (regardless of whether you call it ACI or SDDC) or in building your own data center fabric using solid design principles?

It should be an easy choice if you believe a business should control its own destiny instead of being pulled around by vendor marketing (to paraphrase Russ White).

One of the better explanations of SDN

Stumbled upon this via HighScalability:

Every time I feel like I'm "out of touch" with the hip new thing, I take a weekend to look into it. I tend to discover that the core principles are the same [...]; or you can tell they didn't learn from the previous solution and this new one misses the mark, but it'll be three years before anyone notices (because those with experience probably aren't touching it yet, and those without experience will discover the shortcomings in time.)

Yep, that explains the whole centralized control plane ruckus ;) Read also a similar musing by Ethan Banks.

Fast Linux Packet Forwarding with Thomas Graf on Software Gone Wild

We did several podcasts describing how one could get stellar packet forwarding performance on x86 servers reimplementing the whole forwarding stack outside of kernel (Snabb Switch) or bypassing the Linux kernel and moving the packet processing into userspace (PF_Ring).

Now let’s see if it’s possible to improve the Linux kernel forwarding performance. Thomas Graf, one of the authors of Cilium, claims it can be done and explained the intricate details in Episode 64 of Software Gone Wild.

Network Automation RFP Requirements

After finishing the network automation part of a recent SDN workshop I told the attendees “Vote with your wallet. If your current vendor doesn’t support the network automation functionality you need, move on.”

Not surprisingly, the next question was “And what shall we ask for?” Here’s a short list of ideas, please add yours in comments.

Do I Need Redundant Firewalls?

One of my readers sent me this question:

I often see designs involving several more than 2 DCs spread over different locations. I was actually wondering if that makes sense to bring high availability inside the DC while there's redundancy in place between the DCs. For example, is there a good reason to put a cluster of firewalls in a DC, when it is possible to quickly fail over to another available DC, as a redundant cluster increases costs, licenses and complexity.

Rule#1 of good engineering: Know Your Problem ;) In this particular case:

Check Out the Designing Active-Active and Disaster Recovery Data Centers Webinar

The featured webinar in October 2016 is the Designing Active-Active and Disaster Recovery Data Centers webinar, and the featured videos include the discussion of disaster avoidance challenges and the caveats you might encounter with long-distance vMotion. All subscribers can view these videos; if you’re not one of them yet, start with the trial subscription.

As a trial subscriber you can also use this month's featured webinar discount to purchase the webinar.

Optimize Your Data Center: Virtual Appliances

We got pretty far in our Data Center optimization journey. We virtualized the workload, got rid of legacy technologies, reduced the number of server uplinks, and replaced storage arrays with a distributed file system.

Final step on the journey: replace physical firewalls and load balancers with virtual appliances.

Ansible versus Puppet in Initial Device Provisioning

One of the attendees of my Building Next-Generation Data Center course asked this interesting question after listening to my description of differences between Chef/Puppet and Ansible:

For Zero-Touch Provisioning to work, an agent gets installed on the box as a boot up process that would contact the master indicating the box is up and install necessary configuration. How does this work with agent-less approach such as Ansible?

Here’s the first glitch: many network devices don’t ship with Puppet or Chef agent; you have to install it during the provisioning process.

Use VRFs to Solve Routing-on-Hosts Challenges

One of my readers sent me interesting feedback after reading my explanation of why I’d try not to use OSPF as a routing protocol between hosts and ToR switches. He said:

Unfortunately we can’t use BGP because IBM mainframes support only OSPF or RIP, so we decided to use VRFs instead.

Here’s what they did:

Survey on IXP Routing and Privacy

Marco Canini from UC Louvain is working on an IXP research project focused on bringing privacy guarantees into Internet routing context. They’re trying to understand the privacy considerations of network operators and have created a short survey to gather the initial data.

Researchers from UC Louvain have been involved in tons of really useful projects including BGP PIC, LFA, MP-TCP, Fibbing, Software-defined IXP and flow-based load balancing, so if you’re connected to an IXP, please take your time and fill in the survey.

Distributed On-Demand Network Testing (ToDD) with Matt Oswalt

In March 2016 my friend Matt Oswalt announced a distributed network testing framework that he used for validation in his network automation / continuous integration projects. Initial tests included ping and DNS probes, and he added HTTP testing in May 2016.

The project continues to grow (and already got its own GitHub and documentation page) and Matt was kind enough to share the news and future plans in Episode 63 of Software Gone Wild.

To ask questions about the project, join the Todd channel on networktocode Slack team (self-registration at

Replacing FabricPath with VXLAN, EVPN or ACI?

One of my friends plans to replace existing FabricPath data center infrastructure, and asked whether it would make sense to stay with FabricPath (using the new Nexus 5600 switches) or migrate to ACI.

I proposed a third option: go with simple VXLAN encapsulation on Nexus 9000 switches. Here’s why:

Policing or Shaping? It Depends

One of my readers watched my TCP, HTTP and SPDY webinar and disagreed with my assertion that shaping sometimes works better than policing.

TL&DR summary: policing = dropping excess packets, shaping = delaying excess packets.
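
The difference shows up even in a tiny token-bucket simulation. The sketch below is a simplification under stated assumptions: time advances in fixed ticks, every packet costs one token, the shaper queue is unbounded (real shapers drop once the queue fills), and the names police and shape are made up for this example.

```python
# Token-bucket sketch: policer drops excess packets, shaper delays them.

def police(arrivals, rate, burst):
    """Return (packets sent per tick, total dropped)."""
    tokens = burst
    sent, dropped = [], 0
    for count in arrivals:
        forwarded = min(count, tokens)
        dropped += count - forwarded          # excess packets are lost
        tokens = min(burst, tokens - forwarded + rate)
        sent.append(forwarded)
    return sent, dropped


def shape(arrivals, rate, burst):
    """Return packets sent per tick; excess packets wait in a queue."""
    if rate <= 0:
        raise ValueError("rate must be positive or the queue never drains")
    tokens = burst
    queue = 0
    sent = []
    ticks = list(arrivals)
    while ticks or queue:
        queue += ticks.pop(0) if ticks else 0
        forwarded = min(queue, tokens)
        queue -= forwarded                    # excess packets are delayed
        tokens = min(burst, tokens - forwarded + rate)
        sent.append(forwarded)
    return sent


burst_of_five = [5, 0, 0, 0]
print(police(burst_of_five, rate=1, burst=2))
print(shape(burst_of_five, rate=1, burst=2))
```

With a five-packet burst, a rate of one packet per tick and a bucket of two, the policer forwards two packets and drops three, while the shaper delivers all five spread over four ticks.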

Here’s the picture he sent me (watch the video to get the context and read this article to get the background details):

How Do I Get a Grasp of SDN and NFV?

One of my readers had problems getting the NFV big picture (and how it relates to SDN):

I find the topic area of SDN and NFV a bit overwhelming in terms of information, particularly the NFV bit.

NFV is a really simple concept (network services packaged in VM format); what makes it complex is all the infrastructure you need around it.

How Many vMotion Events Can You Expect in a Data Center?

One of my friends sent me this question:

How many VM moves do you see in a medium and how many in a large data center environment per second and per minute? What would be a reasonable maximum?

Obviously the answer to the first part is it depends (please share your experience in the comments), so we’ll focus on the second one. It’s time for another Fermi estimate.

Docker Networking: Introduction to Microservices and Containers

Dinesh Dutt started his excellent Docker Networking webinar with an introduction to the concepts of microservices and Linux containers. You won’t find any deep dives in this part of the webinar, but all you need to do to get the details you’re looking for is to fill in the registration form.

Why Would I Use BGP and not OSPF between Servers and the Network?

While we were preparing for the Cumulus Networks’ Routing on Hosts webinar Dinesh Dutt sent me a message along these lines:

You categorically reject the use of OSPF, but we have a couple of customers using it quite happily. I'm sure you have good reasons and the reasons you list [in the presentation] are ones I agree with. OTOH, why not use totally stubby areas with the hosts being in such an area?

How about:

This Is Why I’m Not Doing SD-WAN Webinars

One of my long-time regular readers sent me this question:

I was wondering if you have had any interest in putting together an SD-WAN overview/update similar to what you do with data center fabrics where you cover the different product offerings, differentiators, solution scorecard…

That would be a good idea. Unfortunately the SD-WAN vendors aren’t exactly helping.

Juniper Is Serious about OpenConfig and IETF YANG Data Models

When people started talking about OpenConfig YANG data models, my first thought (being a grumpy old XML/XSLT developer) was “that should be really easy to implement for someone with XML-based software and built-in XSLT support” (read: Junos with SLAX).

Here’s what my simplistic implementation would look like:

The Cost of Networking Has Not Declined

One of the common taglines parroted by SDN aficionados goes along the lines of “The cost to acquire and manage server and storage architectures has declined over time while networking stays stubbornly expensive.” (I took it straight from an anonymous blog comment).

Let’s see how well it matches reality.

Getting Started in the Mobile World

Got this challenge from one of my readers:

I've recently changed jobs and I am currently working for a telco. The problem is that I have no idea of what they are talking about when they mention SGSN, GGSN, Gi, Gn, etc... I only know routing and switching stuff :(.

Obviously he tried to search for information and failed.

Whitebox Switching at LinkedIn with Russ White on Software Gone Wild

When LinkedIn announced their Project Falco I knew exactly what one of my future Software Gone Wild podcasts would be: a chat with Russ White (Mr. CCDE, now network architect @ LinkedIn).

It took us a long while (and then the summer break intervened) but I finally got it published: Episode 62 is waiting for you.


Here are the outlines of an interesting ExpertExpress discussion:

  • A global organization wanted to connect data centers across the globe with a new transport backbone.
  • All the traffic has to be encrypted.

Should they buy L2VPN and use MACsec on it or L3VPN and use GETVPN on it (considering they already have large DMVPN deployments in each region)?

How Do I Persuade My Management Automation Makes Sense?

Matt Oswalt made two great points while tweeting about my Automation Gone Wild blog post:

  • Automation should be a strategy. You need management buy-in;
  • You should have at least one person with strong software development experience in your automation team.

However, life is not always rosy, so @stupidengineer asked:

OSPF Areas and Summarization: Theory and Reality

While most readers, commenters, and Twitterati agreed with my take on the uselessness of OSPF areas and inter-area summarization in the 21st century, a few of them pointed out that in practice, the theory and practice are not the same. Unfortunately, most of those counterexamples failed due to broken implementations or vendor “optimizations”.

OpenStack on VMware NSX on Software Gone Wild

Does it make sense to run OpenStack on top of VMware infrastructure? How well does NSX work as a Neutron plug-in? Marcos Hernandez answered these questions (and a lot of others) in Episode 61 of Software Gone Wild (admittedly after a short marketing pitch in the first 10 minutes).

Running BGP between Virtual Machine and ToR Switch

One of my readers left this question on the blog post resurfacing the idea of running BGP between servers and ToR switches:

When using BGP on a VM for mobility, what is the best way to establish a peer relationship with a new TOR switch after a live migration? The VM won't inherently know the peer address or the ASN.

As always, the correct answer is it depends.

Questions about Network Automation Workshop

Marcel Reuter sent me a few questions about my upcoming Network Automation workshop. You might find them interesting, so here they are:

We have a lab with virtual IOS-XE, IOS-XR and Junos (vMX) router. I would like to learn how to provisioning the Lab router.

Covered in the workshop. I’m focusing on vIOS (which is pretty close to IOS Classic and IOS-XE) and Nexus OS because that’s what I can get up and running quickly in VIRL.

Do We Still Need OSPF Areas and Summarization?

One of my ExpertExpress design discussions focused on WAN network design and the need for OSPF areas and summarization (the customer had random addressing and the engineers wondered whether it makes sense to renumber the network to get better summarization).

I was struggling with the question of whether we still need OSPF areas and summarization in 2016 for a long time. Here are my thoughts on the topic; please share yours in the comments.

Using BGP in Leaf-and-Spine Fabrics

In the Leaf-and-Spine Fabric Designs webinar series we started with the simplest possible design: non-redundant server connectivity with bridging within a ToR switch and routing across the fabric.

After I explained the basics (including routing protocol selection, route summarization, link aggregation and addressing guidelines), Dinesh Dutt described how network architects use BGP when building leaf-and-spine fabrics.

Why Is Stretched ACI Infinitely Better than OTV?

Eluehike Chedu asked an interesting question after my explanation of why stretched ACI fabric (or alternatives, see below) is the least horrible way of stretching a subnet: What about OTV?

Time to go back to the basics. As Dinesh Dutt explained in our Routing on Hosts webinar, there are (at least) three reasons why people want to see stretched subnets:

Planning for Migration into the Cloud?

One of my readers sent me this question:

Have you written something about assessment and planning for migration of traditional in-premise data center network to private or public cloud? There would be hundreds of things to check during assessment and then plan accordingly.

Academically, that’s a wrong way of approaching the problem.

Network Automation in Enterprise environments: pipe dream or reality?

When I talk about network automation with enterprise engineers I usually get responses along the lines of “That’s interesting, but it will never happen in my organization. That’s what startups or cloud providers do.”

They couldn’t be more wrong: Thomas Wacker from UBS (one of the top 20 global financial services companies in case you don’t recognize the name) will describe how UBS uses network automation in new data center deployments during our Network Automation DIGS SDN event on September 1st, and we’ll spend the rest of the afternoon focusing on how you could get started and what your first network automation project should be.

Scaling L3-Only Data Center Networks

Andrew wondered how one could scale the L3-only data center networking approach I outlined in this blog post and asked:

When dealing with guests on each host, if each host injects a /32 for each guest, by the time the routes are on the spine, you're potentially well past the 128k route limit. Can you elaborate on how this can scale beyond 128k routes?

Short answer: it won’t.
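
A back-of-the-envelope sketch of the arithmetic behind the question: if every guest is advertised as its own /32, the number of host routes on the spine is simply hosts times guests per host. The host and guest counts below are hypothetical, and the limit is taken as 128,000 entries (an assumption; the actual table size depends on the hardware).

```python
# Hypothetical numbers, chosen only to illustrate the arithmetic.
ROUTE_LIMIT = 128_000  # assumed reading of the "128k route limit" in the question


def host_routes(hosts: int, guests_per_host: int) -> int:
    """Number of /32 prefixes when every guest gets its own host route."""
    return hosts * guests_per_host


def fits(hosts: int, guests_per_host: int, limit: int = ROUTE_LIMIT) -> bool:
    """True when all guest host routes fit into a table of `limit` entries."""
    return host_routes(hosts, guests_per_host) <= limit


if __name__ == "__main__":
    # Three made-up fabric sizes: (number of hosts, guests per host).
    for hosts, guests in [(1_000, 50), (2_000, 64), (5_000, 40)]:
        total = host_routes(hosts, guests)
        verdict = "fits" if fits(hosts, guests) else "exceeds the limit"
        print(f"{hosts} hosts x {guests} guests = {total} /32 routes: {verdict}")
```

With per-guest host routes and no aggregation, the route count grows linearly with the number of guests, so a fixed-size forwarding table caps the size of the fabric.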

Software-Defined Navel Gazing

Software Gone Wild podcast is well into its toddler years and it was time for a teambuilding exercise. Just kidding – we wanted to test new tools and decided to discuss the vacation experiences and podcast ideas while doing that.

On a more serious note: we’re always looking for cool projects, implementations and ideas. Contact us at podcast (-the weird sign-)

Why Would I Attend the Virtual Firewalls Workshop?

One of my subscribers considered attending the Virtual Firewalls workshop on September 1st and asked:

Would it make sense to attend the workshop? How is it different from the Virtual Firewalls webinar? Will it be recorded?

The last answer is easy: No. Now for the other two.

Networking Is Infrastructure – Get Used to It

Jeff Sicuranza left a great comment to one of my blog posts:

Still basically the same old debate from 25 years ago that experienced Network Architects and Engineers understood during technology changes; "Do you architect your network around an application(s) or do you architect your application(s) around your network"

I would change that to “the same meaningless debate”. Networking is infrastructure; it’s time we grow up and get used to it.

Sample Ansible Networking Playbooks on Github

I spent the last week creating numerous scenarios using Ansible networking modules for my upcoming Network Automation workshop. The scenarios use Cisco IOS and Nexus OS modules as I used VIRL for network simulation, but you could easily adapt them to other networking devices.

All the scenarios I’m covering in the workshop are available in my Github repository; to get them explained you’ll have to attend the workshop. Enjoy!

New Webinar: Docker Networking Fundamentals

After the fantastic Docker 101 webinar by Matt Oswalt a few people approached me saying “that was great, but we’d need something more on Docker networking”, and during one of my frequent chats with Dinesh Dutt he mentioned that he already had the slides covering that topic.

Problem solved… and Dinesh decided to do it as a free webinar (thank you!), so all you have to do is register. Hurry up, there are only 1000 places left ;)

We Need to Educate Our Peers

Failure to use DNS, IP addresses embedded in the code, ignoring the physical realities (like bandwidth and latency)… the list of mistakes that eventually get dumped into a networking engineer’s lap is depressing.

It’s easy to reach the conclusion that the people making those mistakes must be stupid or lazy… but in reality most of them never realized they were causing someone else problems because nobody told them so.

And this is why you need automation

I stumbled upon a great description of how you can go bankrupt in 45 minutes due to a manual deployment process. The most relevant part of it:

Any time your deployment process relies on humans reading and following instructions you are exposing yourself to risk. Humans make mistakes. The mistakes could be in the instructions, in the interpretation of the instructions, or in the execution of the instructions.

And no, it's not just application deployment. A similar disaster could happen in your network.

SDN and Modern Physics

I stumbled upon a great ACM article comparing challenges of distributed systems with well-known milestones of modern physics.

The modern networks are probably the ultimate distributed systems. Now take the ideas from that article and apply them to the Centralized Control Plane concept (the last time I checked the marketers were still promoting that academic marvel).

And this is how you build an IPv6-only data center

Tore Anderson has been talking about IPv6-only data centers (and running a production one) for years. We know Facebook decided to go down that same path… but how hard would it be to start from scratch?

Not too hard if you want to do it, know what you're doing, and are willing to do more than buy boxes from established vendors. Donatas Abraitis documented one such approach, and he's not working for a startup but a 12-year-old company. So, don't claim it's impossible ;)

Stretched ACI Fabric Is Sometimes the Least Horrible Solution

One of my readers sent me a lengthy email asking my opinion about his ideas for new data center design (yep, I pointed out there’s a service for that while replying to his email ;). He started with:

I have to design a DR solution for a large enterprise. They have two data centers connected via Fabric Path.

There’s a red flag right there…

TCP Congestion Avoidance on Satellite Links

While some people spread misinformation, others work hard to figure out how to make TCP work on exotic links with low bandwidth and one-second RTT.

Ulrich Speidel published a highly interesting article on APNIC blog describing the challenges of satellite Internet access and the approach (network coded TCP) they took to avoid them.

Ethernet-over-VPN: What Could Possibly Go Wrong?

One of my readers sent me a link to SoftEther, a VPN solution that

[…] penetrates your network admin's troublesome firewall for overprotection. […] Any deep-packet inspection firewalls cannot detect SoftEther VPN's transport packets as a VPN tunnel, because SoftEther VPN uses Ethernet over HTTPS for camouflage.

What could possibly go wrong with such a great solution?

OpenFlow and Firewalls Don’t Mix Well

In one of my ExpertExpress engagements the customer expressed the desire to manage their firewall with OpenFlow (using OpenDaylight) and I said, “That doesn’t make much sense”. Here’s why:

Obviously if you can't imagine your life without OpenDaylight, or if your yearly objectives include "deploying OpenDaylight-based SDN solution", you can use it as a REST-to-NETCONF translator assuming your firewall supports NETCONF.

Automate the Exceptions

Every time I have a network automation presentation (be it a 2-day workshop or a 45-minute keynote) I get the same question afterwards: “How do we deal with exceptions?”

The correct answer is obvious: “there should be no exceptions, because one-offs usually cost you more than you earn with them,” but as always the reality tends to intervene.

Optimize Your Data Center: Use Distributed File System

Let’s continue our journey toward a two-switch data center. What can we do after virtualizing the workload, getting rid of legacy technologies, and reducing the number of server uplinks to two?

How about replacing dedicated storage boxes with a distributed file system?

In late September, Howard Marks will talk about software-defined storage in my Building Next Generation Data Center course. The course is sold out, but if you register for the spring 2017 session, you’ll get access to the recording of Howard’s talk.

Why Is Every SDN Vendor Bashing the Networking Engineers?

This blog post was written almost two years ago (and sat half-forgotten in a Word file somewhere in my Dropbox), but as it seems not much has changed in the meantime, it’s time to publish it anyway.

I was listening to the fantastic SDN Trinity podcast while biking around Slovenian hills and almost fell off the bike while furiously nodding to a statement along the lines of “I hate how every SDN vendor loves to bash networking engineers.”

Cutting through the IPv6 Requirements Red Tape

A few years ago a bunch of engineers agreed that the customers need a comprehensive “IPv6 Buyer’s Guide” and thus RIPE-554 was born. There are also IPv6 certification labs, US Government IPv6 profile and other initiatives. The common problem: all these things are complex.

However, it’s extremely easy to get what you want as Ron Broersma explained during his presentation at a recent Slovenian IPv6 meeting. All it takes is a single paragraph in the RFP saying something along these lines:

The equipment must have the required functionality and performance in IPv6-only environment.

Problem solved (the proof is left as an exercise for the reader… or you could cheat and watch Ron’s presentation, which you should do anyway ;).

Does It Make Sense to Build Your Own Networking Solutions?

One of my readers was listening to the Snabb Switch podcast and started wondering “whether it’s possible to leverage and adopt these bleeding-edge technologies without a substantial staff of savvy programmers?”

Short answer: No. Someone has to do the heavy lifting, regardless of whether you have programmers on-site, outsource the work to contractors, or pay vendors to do it.

Build Your Own Service Provider Gear on Software Gone Wild

A few days after I published a blog post arguing that most service providers cannot possibly copy Google’s ideas, Giacomo Bernardi wrote a comment saying “well, we managed to build our own gear.”

Initially I thought they built their own Linux distribution on top of an x86 server, but what Giacomo Bernardi described in Episode 59 of Software Gone Wild goes way beyond that:

Optimize Your Data Center: Reduce the Number of Uplinks

Remember our journey toward a two-switch data center? So far we:

Time for the next step: read a recent design guide from your favorite hypervisor vendor and reduce the number of server uplinks to two.

Not good enough? Building a bigger data center? There’s exactly one seat left in the Building Next Generation Data Center online course.

Directed ARP Saga Continues

Reading my Directed ARP and ICMP Redirects blog post you might have wondered “how did Directed ARP ever get into ***redacted***?”

I searched for “directed ARP cisco” and found this gem, which really talks about unicast ARP behavior, an ancient mechanism documented in RFC 1122 (it’s not my Google-Fu, I got the reference to RFC 1122 in this blog post).

On the Lossiness of TCP

When someone tells you that “TCP is a lossy protocol” during a job interview, don’t throw him out immediately – he was just trusting the Internet a bit too much (click to enlarge).

Everyone has a bad hair day, and it really doesn’t matter who published that text… but if you’re publishing technical information, at least try to do no harm.

Where Is the Explosion of Overlay Virtual Networks

Three years ago I was speaking with one of the attendees of my overlay virtual networking workshop @ Interop Las Vegas and he asked me how soon I thought the overlay virtual networking technologies would be accepted in the enterprise networks.

My response: “you might be surprised at the speed of the uptake.” Turns out, I was wrong (again). Today I’m surprised at the lack of that speed.

Big Chain Deep Dive on Software Gone Wild

A while ago Big Switch Networks engineers realized there’s a cool use case for their tap aggregation application (Big Tap Monitoring Fabric) – an intelligent patch panel traffic steering solution used as security tool chaining infrastructure in DMZ… and thus the Big Chain was born.

Curious how their solution works? Listen to Episode 58 of Software Gone Wild with Andy Shaw and Sandip Shah.

Directed ARP and ICMP Redirects

One of my readers sent me this question:

When I did my ***redacted*** I encountered a question about Directed ARP. The RFC is in the "experimental" stage, and I found it really weird from ***** to include such a hidden gem in the ***redacted***.

Directed ARP is clearly one of those weird things that people were trying out in the early days of networking when packet forwarding and bandwidth were still expensive (read the RFC for more details), but I kept wondering “what exactly is going on when a host receives an ICMP redirect?” Time for a hands-on test.

Is OVSDB a Control- or Management-Plane Protocol?

A while ago I discussed whether XMPP is a control- or management-plane protocol (spoiler: it depends). How about OVSDB? Here’s another question from one of my readers:

Why is Openflow considered as control plane protocol and OVSDB management plane protocol if both are relying on SDN controller? Is it because Openflow can directly modify the dataplane?

SDN controllers can use control- or management-plane protocols to get the job done.

Virtual Firewalls: Featured Webinar in June 2016

Virtual Firewalls is the featured webinar in June 2016, and the featured videos (marked with a star) explain the difference between virtual contexts and virtual appliances, and the virtual firewalls taxonomy.

To view the videos, log in (or enroll in the trial subscription if you don’t have an account yet), select the webinar from the first page, and watch the videos marked with a star.

If you're a trial subscriber and would like to get access to the whole webinar, use this month's featured webinar discount (and keep in mind that every purchase brings you closer to the full subscription).

SDN as an Abstraction Layer

During the Introduction to SDN webinar I covered numerous potential definitions:

I find all of these definitions too narrow or even misleading. However, the “SDN is a layer of abstraction” one is not too bad (see also RFC 1925 section 2.6a).

Is BGP Really that Complex?

Anyone following the popular networking blogs and podcasts is probably familiar with the claim that BGP is way too complex to be used in whatever environment. On the other hand, more and more smart people use it when building their data center or WAN infrastructure. There’s something wrong with this picture.

Using Macvlan and Ipvlan with Docker on Software Gone Wild

A few weeks after I published Docker Networking podcast, Brent Salisbury sent me an email saying “hey, we have experimental Macvlan and Ipvlan support for Docker” – a great topic for another podcast.

It took a while to get the stars aligned, but finally we got Brent, Madhu Venugopal, John Willis and Nick Buraglio on the same Skype call resulting in Episode 57 of Software Gone Wild.

The Future of Multicast and QoS

A. Friend sent me a long list of questions after listening to the excellent Future of Networking podcast with Martin Casado because (as he said) he prefers “having a technical discussion with arguments and not just throwing statements out there.”

He started with “Martin's view seems to be that network is all plumbing and all the intelligence should be in the applications.”

What Is Software-Defined Security?

Gabi Gerber is organizing a Software-Defined Security event in Zurich next week in which I’ll talk about real-life security solutions that could be called software defined for whatever reason, and my friend Christoph Jaggi sent me a few questions trying to explore this particular blob of hype.

For obvious reasons he started with “Isn’t it all just marketing?”

Building a L2 Fabric on top of VXLAN: Arista or Cisco?

One of my readers working as an enterprise data center architect sent me this question:

I've just finished a one-week POC with Arista. For fabric provisioning and automation, we were introduced to CloudVision. My impression is that there are still a lot of manual processes when using CloudVision.

Arista initially focused on DIY people and those people loved the tools Arista EOS gave them: Linux on the box, programmability, APIs… However

Optimize Your Data Center: Ditch the Legacy Technologies

In our journey toward a two-switch data center we covered:

It’s time for the next step: get rid of legacy technologies like six 1GE interfaces per server or two FC interface cards in every server.

Need more details? Watch the Designing Private Cloud Infrastructure webinar. How about an interactive discussion? Register for the Building Next-Generation Data Center course.

Feedback: Layer-2 Leaf-and-Spine Fabrics

Occasionally I get feedback that makes me say “it’s worth doing the webinars ;)”. Here’s one I got after the layer-2 session of Leaf-and-Spine Fabric Designs webinar:

I work at a higher level of the stack, so it was a real eye opener especially with so much opinionated "myths" on the web that haven't been critically challenged such as [the usefulness of] STP.

There’s more feedback on this web page where you can also buy the webinar recording (or register for the next session of the webinar once they are scheduled).

Can Enterprise Workloads Run on Bare-Metal Servers?

One of my readers left a comment on my “optimize your data center by virtualizing the servers” blog post saying (approximately):

Seems like LinkedIn did it without virtualization :) Can enterprises achieve this to some extent?

Assuming you want to replace physical servers with one or two CPU cores and 4GB of memory with modern servers having dozens of cores and hundreds of GB of memory the short answer is: not for a long time.

Model-Driven Networking on Software Gone Wild

Model-driven networking seems to be another buzzword riding on top of the SDN wave. What exactly is it, how is it supposed to work, will it really be vendor-independent, and has anyone implemented it? I tried to get some answers to these questions from Jeff Tantsura, chair of IETF Routing Area Working Group, in Episode 55 of Software Gone Wild.

OpenStack Networking, Availability Zones and Regions

One of my ExpertExpress engagements focused on networking in a future private cloud that might be built using OpenStack. The customer planned to deploy multiple data centers, and I recommended that they do everything they can to make sure they don’t make them a single failure domain.

Next step: translate that requirement into OpenStack terms.

Yeah, Blame It on Cisco

A Technology Market Builder (in his own words) from a major networking vendor decided to publish a thought leadership article (in my sarcastic words) describing how Cisco’s embrace of complexity harmed the whole networking industry.

Let’s see how black this kettle-blaming pot really is ;), and make sure to have fun reading the comments to the original article.

What Are The Problems with Broadcom Tomahawk? We Don’t Know

One of my readers has customers that already experienced performance challenges with Tomahawk-based data center switches. He sent me an email along these lines:

My customers are concerned about buffer performance for packets that are 200 bytes and under. MORE IMPORTANTLY, a customer informed me that there were performance issues when running 4x25GE connections when one group of ports speaks to another group.

Reading the report Mellanox published not so long ago it seems there really is something fishy going on with Tomahawk.

Unexpected Recovery Might Kill Your Data Center

Here’s an interesting story I got from one of my friends:

  • A large organization used a disaster recovery strategy based on stretched IP subnets and restarting workloads with unchanged IP addresses in a secondary data center;
  • One day they experienced a WAN connectivity failure in the primary data center, and their disaster recovery plan kicked in.


Software-Defined Security and VMware NSX Events

I’m presenting at two Data Center Interest Group Switzerland events organized by Gabi Gerber in Zurich in early June:

  • In the morning of June 7th we’ll talk about software-defined security, data center automation and open networking;
  • In the afternoon of the same day (so you can easily attend both events) we’ll talk about VMware NSX microsegmentation and real-life implementations.

I hope to see you in Zurich in a bit more than a month!

Response: Are Open-Source Controllers Ready for Carrier-Grade Services?

My beloved source of meaningless marketing messages led me to a blog post with a catchy headline: are open-source SDN controllers ready for carrier-grade services?

It turned out the whole thing was a simple marketing gig for Ixia testers, but supposedly “the response of the attendees of an SDN event was overwhelming”, which worries me… or makes me happy, because it’s easy to see plenty of fix-and-redesign work in the future.

More Open-Source Network Management Tools on Software Gone Wild

After listening to Open-Source Network Engineer Toolbox, Nick Buraglio sent me an email saying “we should do another podcast on open-source network management tools…” and so we did. In Episode 56 of Software Gone Wild Nick, Elisa Jasinska and I discussed a whole range of network management challenges and open-source tools you can use to address them.

Implementing BGP-Based SDN Controller

One of my readers sent me this observation while reviewing my BGP-Based SDN Solutions webinar:

I am a bit surprised the SDN controller can actually be so lightweight.

Well, that's the benefit of augmenting an existing well-developed ecosystem instead of reinventing the wheel and reimplementing every single bit of functionality we had to develop to make networks work throughout the last 5 decades.

Optimize Your Data Center: Virtualize Your Servers

A month ago I published the video where I described the idea that “two switches is all you need in a medium-sized data center”. Now let’s dig into the details: the first step you have to take to optimize your data center infrastructure is to virtualize all servers.

For even more details, watch the Designing Private Cloud Infrastructure webinar, or register for the Building Next-Generation Data Center course.

Scalability of OpenFlow Control Plane Network

This article was initially sent to my SDN mailing list. To register for SDN tips, updates, and special offers, click here.

I got an interesting question from one of my readers:

If every device talking to a centralized control plane uses an out-of-band channel to talk to the OpenFlow controller, isn’t this a scaling concern?

A year or so ago I would have said NO (arguing that the $0.02 CPU found in most networking devices is too slow to overload a controller or reasonably-fast control-plane network).

Some People Don’t Get It: It Will Eventually Fail

Mark Baker left this comment on my Stretched Firewalls across Layer-3 DCI blog post:

Strange how inter-DC clustering failure is considered a certainty in this blog.

Call it experience or exposure to a larger dataset. Anything you build will eventually fail; just because you haven’t experienced the failure yet doesn’t mean that the system will never fail but only that you were lucky so far.
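
To put a number on "eventually": a minimal sketch, assuming failures in different years are independent and the yearly failure probability is constant (both simplifications, and the 5% figure is purely hypothetical). The chance of seeing at least one failure in n years is then 1 - (1 - p)^n, which creeps toward certainty as n grows.

```python
def prob_at_least_one_failure(p_per_year: float, years: int) -> float:
    """Probability of one or more failures within `years`.

    Assumes independent years with the same failure probability,
    which is a simplification made for illustration only.
    """
    if not 0.0 <= p_per_year <= 1.0:
        raise ValueError("p_per_year must be between 0 and 1")
    if years < 0:
        raise ValueError("years must not be negative")
    # Complement of "no failure in any of the years".
    return 1.0 - (1.0 - p_per_year) ** years


if __name__ == "__main__":
    p = 0.05  # hypothetical yearly failure probability
    for years in (1, 5, 10, 20):
        risk = prob_at_least_one_failure(p, years)
        print(f"{years:2d} years: {risk:.1%} chance of at least one failure")
```

Not having seen a failure in the first few years says little about the years that follow; under these assumptions the cumulative probability only goes up.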

First Guest Speaker in Building Next-Generation Data Center Course

When I started thinking about my first online course, I decided to create something special – it should be way more than me talking about cool new technologies and designs – and the guest speakers are a crucial part of that experience.

The first guest speaker is one of the gurus of network design and complexity, wrote numerous books on the topic, and recently worked on a hardware-independent network operating system.

More on Reading and Writing Books

Russ White wrote a great response to my “Do You Really Want to Write that Book?” blog post and I couldn’t agree more with what he wrote. Unfortunately, he seems to be a bit over-idealistic when analyzing why the market for high-end content is so small.

You know I usually have a cynical explanation handy, so here it is: too many people calling themselves engineers for no particular reason simply don’t care. It’s way easier to Google-and-paste your way around than to invest time in understanding the fundamentals.

Shortest Path Bridging (SPB) and Avaya Fabric on Software Gone Wild

A few months ago I met a number of great engineers from Avaya and they explained to me how they creatively use Shortest Path Bridging (SPB) to create layer-2, layer-3, L2VPN, L3VPN and even IP Multicast fabrics – it was clearly time for another deep dive into SPB.

It took me a while to meet again with Roger Lapuh, but finally we started exploring the intricacies of SPB, and even compared it to MPLS for engineers more familiar with MPLS/VPN. Interested? Listen to Episode 54 of Software Gone Wild.
