Building Network Automation Solutions
6 week online course starting in September 2017

Optimize Data Center Infrastructure: Virtualize Network Services

We’re almost done with our data center infrastructure optimization journey. In this step, we’ll virtualize the network services.

The Cost of Networking Hardware (and Disaggregation)

Eyvonne Sharp wrote an interesting blog post describing the challenges Cisco might have integrating Viptela acquisition, particularly the fact that Viptela has a software solution running on low-cost hardware.

Guess what… Cisco IOS also runs on low-cost hardware, it’s just that Cisco routers are sold as a software+hardware bundle masquerading as expensive hardware.

Feedback: Network Automation 101

Some networking practitioners start their network automation journey with the Python or Ansible dilemma. Engineers and architects usually want to understand the bigger picture first, and figure out the potential showstoppers and roadblocks. One of them left this feedback on the Network Automation 101 webinar:

A must-have overview of fundamental Network Automation concepts. I wouldn't face an automation project without understanding these concepts first.

Teach IPv6 First and Automate the Deployment

In mid-July dr. Olivier Bonaventure (one of the unsung networking heroes who’s always trying to address real-life problems instead of inventing unicorn solutions in search of a problem) sent an email to v6ops mailing list describing how they teach networking.

Short summary for differently-attentive:

Feedback: Open Networking for Large-Scale Networks

Got this feedback from a network architect attending the Open Networking for Large-Scale Networks webinar:

I used the webinar when preparing for a meeting/discussion with a NOS SW-vendor. In the meeting, my knowledge was completely up-to-speed & I was on the level with the vendor in the discussion! :-)

Obviously, Russ White and Shawn Zandi did a great job based on their real-life hands-on experience (they use whitebox switches @ LinkedIn).

Interview with Daniel Dib

Daniel Dib is setting up a networking career (from a down-to-earth engineer’s perspective) web site, and started populating it with numerous interviews with fellow networking engineers and architects (all of them well worth reading).

Here are my answers to his questions.

RFC 8212: Bringing Sane Defaults to EBGP

It’s amazing how long it can take to get some sanity into networking technologies. RFC 8212 specifies that a BGP router should not announce prefixes over EBGP until its routing policy has been explicitly configured. It took us only 22 years to get there…

For more technical details, read this email by Job Snijders.

Net Neutrality (Again and Again and Again)

Net neutrality is one of those topics that should never have existed, but of course it inevitably erupts every so often, so here we go…

Not so long ago Robert Graham published his anti-net-neutrality arguments which are (no surprise) not much different from what I wrote when I still cared about this argument (here, here, here and here). While I agree with his overall perspective, I completely disagree with his view of Comcast’s initial response to network congestion.

RFC8200: IPv6 Is an Internet Standard

You wouldn’t believe it – after almost 22 years (yeah, it’s been that long since RFC 1883 was published), IPv6 became an Internet standard (RFC8200/STD86). No wonder some people claim IETF moves at glacial speed ;)

Speaking of IPv6, IETF and glacial speeds – there’s been a hilarious thread before Prague IETF meeting heatedly arguing whether the default WLAN SSID should be IPv6-only (+NAT64). Definitely worth reading (for the entertainment value) over a beer or two.

New in Ansible for Networking Engineers

I’ve added two new case studies to Ansible for Networking Engineers online course:

Create network diagrams from LLDP information playbook focuses on creating a single summary report based on information from numerous devices (and the report just happens to be network diagram in DOT format).

PRTG Monitoring Software Now Available in Cloud Version

One of the more interesting presentations we had during Tech Field Day Extra @ Cisco Live Berlin was coming from Paessler, a company developing PRTG, a little-known network monitoring software.

More about PRTG in TFD videos and here, here, here and here.

RFC 8196: IS-IS Autoconfiguration

Finally a group of engineers figured out it’s a good idea to make things less complex instead of heaping layers of complexity on top of already-complex kludges.

RFC 8196 specifies default values and extensions to IS-IS that make it a true plug-and-play routing protocol. I wonder when we’ll see it implemented now that everyone is obsessed with intent-based hype.

Promises Gone Wild

I got several interesting replies to my automation and orchestration blog post. Some of them were so far in the land of alternate definitions that they were literally off the charts. Here’s one of the best I got in that category:

(Not-so-very) Early Network Automation

If you’re not old enough to know otherwise, you’d think (based on recent hype) that we discovered network automation a few years ago. Not true. One of my readers sent me a link to excellent Managing IP Networks with Free Software presentation from NANOG26 (October 2002).

I found the presentation awesome, nothing new, and extremely sad… all at the same time.

IPv6 Link-Local Addresses and VLAN Interfaces

One of my readers sent me an email that’s easiest paraphrased into: “Why can’t I have a different IPv6 link-local address (LLA) on every access port connected to a VLAN interface?

There’s probably nothing stopping someone from implementing such an approach, but it would go against the usual understanding of how bridging and routing interact in L2+L3 switches.

Q&A: Building Network Automation Solutions Online Course

I got tons of questions about the upcoming Building Network Automation Solutions online course. It always starts with the same one:

Is access to the self-study material granted upon enrollment?

Absolutely. You also get access to everything we did in January, and the new self-paced Ansible for Networking Engineers online course.

Automation or Orchestration?

Have you ever wondered what the difference between automation and orchestration is?

Wikipedia defines automation as use of various control systems for operating equipment. The definition I prefer (because it’s easier to understand in network automation environment) is elimination of well-defined repeatable manual tasks – the emphasis being on well-defined and repeatable.

Swimlanes, Read-Write Transactions and Session State

Another question from someone watching my Designing Active-Active and Disaster Recovery Data Centers webinar (you know, the one where I tell people how to avoid the world-spanning-layer-2 madness):

In the video about parallel application stacks (swimlanes) you mentioned that one of the options for using the R/W database in Datacenter A if the user traffic landed in Datacenter B in which the replica of the database is read-only was to redirect the user browser with the purpose that the follow up HTTP POST land in Datacenter A.

Here’s the diagram he’s referring to:

New in Ansible for Networking Engineers

Here’s the list of materials (and other changes) I added to the Ansible for Networking Engineers webinar and online course in June 2017.

The first thing you’ll notice is the brand-new user interface with collapsible sections, making it easier to grasp the big picture (the change was badly needed – the webinar is already almost 12 hours long).

Breaking News: SNMP-based NMS Can Replace SDN ;)

Got this remark from one of my SDN mailing list subscribers:

There are NMSs that are based on SNMP, their manufacturers that say they can replace an SDN architecture, because they allow to automate the management of the network.

O’RLY?

How Do I Start Automating Network Device Configurations in an Existing Network?

I get a “how do I get started with network automation” question every other week, and when I wrote a lengthy reply to one about configuration templating of existing snowflake network on networktocode Slack channel I decided it’s time to turn my replies into a blog post.

To Jumbo or Not to Jumbo?

Here’s the question I got from one of my readers:

Do you have any data available to show the benefits of jumbo frames in 40GE/100GE networks?

In case you’re wondering why he went down this path, here’s the underlying problem:

Solutions Published for Wireshark Challenges

Johannes Weber published the solutions to his Wireshark challenges. How many did you solve?

Sample Network Automation Ansible Playbooks

I developed over a dozen different Ansible-based network automation solutions in the last two years for my network automation workshops and online course, and always published them on GitHub… but never built an index, or explained what they do, and why I decided to do things that way.

With the new my.ipSpace.net functionality I added for online courses I got the hooks I needed to make the first part happen:

Asymmetrical Traffic Flows and Complexity

One of my readers sent me a list of questions on asymmetrical traffic flows in IP networks, particularly in heavily meshed environments (where it’s really hard to ensure both directions use the same path) and in combination with stateful devices (firewalls in particular) in the forwarding path.

Unfortunately, there’s no silver bullet (and the more I think about this problem, the more I feel it’s not worth solving).

Moving to Summer Schedule

The inevitable summer decline of visitors has started, so I'm switching (like every summer) to a lower publishing frequency. Given my current focus (here and here) expect one network automation post and one other in-depth post every week… and maybe an occasional this-is-worth-reading link.


Working in the summer office ;)

Take some time off, enjoy the vacations, and I hope to meet you in the September online course ;)

Monitoring SDN Networks: Featured Webinar in June 2016

Monitoring SDN Networks is the featured webinar of June 2017, and in the featured video Terry Slattery (CCIE#1026) talks about network analysis of SDN.

If you’re a trial subscriber, log into my.ipspace.net, select the webinar from the first page, and watch the video marked with star… and if you’d like to try the ipSpace.net subscription register here.

Trial subscribers can also use this month's featured webinar discount to get a 25% discount (and get closer to the full subscription).

Optimize Data Center Infrastructure: Use Distributed File System

Another part of my data center infrastructure optimization presentation is transcribed, edited and published: use distributed file system (at least for VM disk images).

First Speakers in Autumn Network Automation Course

Today I can tell you who the first speakers in the autumn 2017 network automation online course will be.

Sounds promising? Why don’t you register before we run out of early-bird tickets?

Want to Learn Something New? Learn Git!

If you'd come to me as a networking engineer and say “there's one new thing I want to learn that's outside of my $dayjob” I'd probably say “invest some serious time into learning Git (beyond memorizing the quick recipes) if you haven’t done that already”

Full disclosure: not so long ago I tried to avoid Git as much as possible… and then it suddenly clicked ;)

Packet Fabric on Software Gone Wild

Imagine a service provider that allows you to provision 100GE point-to-point circuit between any two of their POPs through a web site and delivers in seconds (assuming you’ve already solved the physical connectivity problem). That’s the whole idea of SDN, right? Only not so many providers got there yet.

New: Ansible for Networking Engineers Online Course

Long story short: I’m launching Ansible for Networking Engineers self-paced course today. It’s already online and you can start whenever you wish.

Now for the details…

Isn’t there already an Ansible for Networking Engineers webinar? Yes.

So what’s the difference? Glad you asked ;)

Leaf-and-Spine Fabrics: Implicit or Explicit Complexity?

During Shawn Zandi’s presentation describing large-scale leaf-and-spine fabrics I got into an interesting conversation with an attendee that claimed it might be simpler to replace parts of a large fabric with large chassis switches (largest boxes offered by multiple vendors support up to 576 40GE or even 100GE ports).

As always, you have to decide between implicit and explicit complexity.

Self-Study Exercises Added to Ansible for Networking Engineers Webinar

Last week I published self-study exercises for the YAML and Jinja2 modules in the Ansible for Networking Engineers webinars, and a long list of review questions for the Using Ansible and Ansible Deeper Dive sections.

I also reformatted the webinar materials page. Hope you’ll find the new format easier to read than the old one (it’s hard to squeeze over 70 videos and links on a single page ;).

Oh, and you do know you get Ansible webinar (and over 50 other webinars) with ipSpace.net subscription, right?

Where Do You Want to Move the Complexity?

Michael Klose left an interesting remark on my Regional Internet Exits in Large DMVPN Deployment blog post saying…

Would BGP communities work? Each regional Internet Exit announce Default Route with a Region Community and all spokes only import default route for their specific region community.

That approach would definitely work. However, you have to decide where to move the complexity.

Videos: PowerShell 101

Mitja Robas started his PowerShell for Networking Engineers presentation with a brief introduction to PowerShell and a few simple hands-on examples. Enjoy the videos ;)

Worth Reading: Career IT Advice from Industry Gurus

Neil Anderson collected career advice from 111 IT industry gurus (just getting all of them to respond must have been monumental effort). Well worth reading ;)

Cisco ACI, VMware NSX and Programmability

One of my readers sent me a lengthy email describing his NSX-versus-ACI views. He started with [slightly reworded]:

What I want to do is to create customer templates to speed up deployment of application environments, as it takes too long at the moment to set up a new application environment.

That’s what we all want. How you get there is the interesting part.

Webinars in First Half of 2017

The first half of 2017 is almost gone, so it’s time to check how far I got with the plans I made in January.

Delivered:

Is Anyone Using Open Daylight?

A while ago I sent out an email to my SDN and network automation mailing list (join here) asking whether anyone uses Open Daylight in anything close to a production environment (because I haven’t ever seen one).

Among many responses saying “not here” I got a polite email from VP of Marketing working for a company that sells OpenDaylight-related services listing tons of customer deployments (no surprise there).

Start Using OpenConfig with NAPALM on Software Gone Wild

OpenConfig sounds like a great idea, but unfortunately only a few vendors support it, and it doesn’t run on all their platforms, and you need the latest-and-greatest software release. Not exactly a set of conditions that would encourage widespread adoption.

Things might change with the OpenConfig data models supported in NAPALM. Imagine you could parse router configurations or show printouts into OpenConfig data structures, or use OpenConfig to configure Cisco IOS routers running a decade old software.

Optimize Data Center Infrastructure: Reduce the Number of Uplinks

The work of editing transcripts of my two switches presentation is (very slowly) moving forward. In the fourth part of the Optimize Your Data Center Infrastructure series I’m talking about reducing the number of uplinks.

Use Your Networking Knowledge to Design Automation Solution

I’m getting plenty of emails from not-so-very-young networking engineers trying to make career transitions. I got this one from a CCIE in his mid-40s:

Would you think the SDN and Data Center paths would be suitable for a long standing engineer?

Absolutely. It's just networking, although it's sometimes disguised a bit.

This article was initially sent to my Network Automation mailing list.

Worth Reading: Who’s Protecting the Cloud API

Everyone loves talking about cloud security (or lack thereof) and focuses on protecting workloads, data in the cloud… but have you ever asked the question “how protected is the cloud management API?

Webinars in This Week

The spring craziness is still in full swing – we’ll have three webinars this week (a first) and I was so busy I didn’t even have time to write about them. Let’s fix that.

Data Center Updates on Monday is the second part of server virtualization, virtual machines and containers update to Data Center 3.0 webinar. We covered virtual machines in the last session (April 25th), this time we’ll talk about containers.

David Barroso (now at Fastly) will talk about NAPALM in Ansible on Tuesday.

Let's build a small network automation solution!

Do you have the feeling that you should know more about network automation, but don't know where to start? I was facing that same problem in 2015, and then started exploring Ansible (plus YAML, Jinja2, Git, Puppet…), creating small playbooks, and finally came to a point where I said "now I know that you can have a small solution solving an actual problem ready in a few weeks even if you know absolutely nothing today".

Regional Internet Exits in Large DMVPN Deployment

One of my readers wanted to implement a large DMVPN cloud with regional Internet exit points:

We need to deploy a regional Internet exits and I’d like to centralize them.  Each location with a local Internet exit will be in a region and that location will advertise a default-route into the DMVPN domain to only those spokes in that particular region.

He wasn’t particularly happy with the idea of deploying access and core DMVPN clouds:

Worth Reading: Security and IoT

A great essay by Bruce Schneier about (lack of) security in IoT and why things won’t improve without some serious intervention.

Few Secrets of Successful Learning: Focus, Small Chunks, and Sleep

One of my readers sent me a few questions about the leaf-and-spine fabric architectures webinar because (in his own words)

We have some projects 100% matching these contents and it would be really useful this extra feedback, not just from consultants and manufacturer.

When I explained the details he followed up with:

Now, I expect in one or two weeks to find some days to be able to follow this webinar in a profitable way, not just between phone calls and emails.

That’s not how it works.

Network Testing on Software Gone Wild

Network automation and orchestration is a great idea… but how do you verify that what your automation script wants to do won’t break the network? In Episode 78 of Software Gone Wild we discussed the intricacies of testing network automation solutions with Kristian Larsson (developer of Terastream orchestration softare) and David Barroso of the NAPALM and SDN Internet Router fame.

Looking for a Tool to Create Device Configurations from Templates

One of my readers sent me this question:

Other than using Excel (and of course an automation tool) any suggestions for a tool to create device config for some 200 customer VRFs from a standard template?

You need three things to get the job done:

Failure Is Inevitable – Deal with It!

Last week a large European financial institution had a bad hair day. My friend Christoph Jaggi asked for my opinion, and I decided not to focus on the specific problem (that’s what post-mortems are for) but to point out something that’s often forgotten: don’t believe your system won’t fail, be prepared to deal with the failure.

Have to choose between VMware NSX and Cisco ACI? You’re Not Alone

I keep getting questions along the lines of “should I go with VMware NSX or should I deploy Cisco ACI” every single week, and as you know it’s hard to answer anything but it depends without spending hours on the topic.

That’s exactly what we plan to do in Zurich next Tuesday (May 16th) in a DIGS workshop that will run in parallel with the Data Center & Cloud Day (part of the SIGS Technology Conference).

Follow-up: Nexus-OS Dropping Configuration Commands

Not long after I published the let’s drop some configuration commands rant I got a very nice email from Nicolas Delecroix, Technical Marketing Engineer in Cisco INSBU, effectively saying “Would you have time for a short WebEx call to discuss the root cause of the problem and what we did to fix it?”

Of course I agreed and here’s what they told me:

What IPv6 Transition Mechanisms Are Actually Being Used?

An engineer watching my IPv6 Transition Mechanisms webinar sent me this question:

We would appreciate any insight you might have as to which transitional mechanisms the ISPs are actually deploying.

All of them ;)

Interim Forwarding Loops in OSPF or IS-IS Networks

One of my readers sent me this question (slightly rephrased):

Assume you have A,B and C connected in a triangle (with an alternate longer path to C). What happens if C loses its links to A and B? Won’t the traffic to C loop between A and B for a while?

As always, it depends.

Securing Network Automation Video Is Online

The awesome Troopers crew published conference videos, including my Securing Network Automation presentation (more, including slide deck).

What is VxRail?

One of my readers was considering Dell/EMC hyperconverged solutions and sent me this question:

Just wondering if you have a chance to check out VxRail.

I read the data sheet and spec sheet, but have never seen anyone using it (any real-life experience highly welcome – please write a comment).

Salt and SaltStack on Software Gone Wild

Ansible, Puppet, Chef, Git, GitLab… the list of tools you can supposedly use to automate your network is endless, and there’s a new kid on the block every few months.

In Episode 77 of Software Gone Wild we explored Salt, its internal architecture, and how you can use it with Mircea Ulinic, a happy Salt user/contributor working for Cloudflare, and Seth House, developer @ SaltStack, the company behind Salt.

Worth Reading: Who Moved My Control Plane?

Jordan Martin published a nice summary of what I’ve been preaching for years: centralized control plane doesn’t work (well) while controller-based network orchestration makes perfect sense.

While I totally agree with what he wrote he got the hype angle wrong:

Update: VMware NSX in Redundant L3-only Data Center Fabric

Short update for those that read the original blog post: it turns out that the answer to the question “Is it possible to run VMware NSX on redundantly-connected hosts in a pure L3 data center fabric?” is still NO.

VTEPs from different ESXi hosts can be in different subnets, but while a single ESXi host might have multiple VTEPs, the only supported way to use them is to put them in the same subnet. I removed the original blog post.

A huge thank you to everyone who pushed me with their comments and emails to find the correct answer.

Mini-RSA in Zurich, NSX, ACI, Automation…

I’ll be doing several on-site workshops in the next two months. Here’s a brief summary of where you could meet me in person.

A bit of manual geolocation first: if you’re from Europe, check out the first few entries, if you’re from US, there’s important information for you at the bottom, and if you don’t want to travel Europe or US, there’s an online course starting in September ;)

Figure Out What the Customer Really Needs

One of the toughest challenges you can face as a networking engineer is trying to understand what the customer really needs (as opposed to what they think they’re telling you they want).

For example, the server team comes to you saying “we need 5 VLANs between these 3 data centers”. What do you do?

Video: Routing on Hosts Deep Dive

Wondering how exactly routing on hosts works? Dinesh Dutt explained the details in this 10-minute video during the Leaf-and-Spine Fabric Designs webinar.

Optimize Data Center Infrastructure: Go with 10GE

I published the third installment of the Optimize Your Data Center Infrastructure story on my main web site. In this part I’m telling you to go with 10GE and consider 25GE.

Amazing Discovery: Stability Matters

Here’s an interesting blog post (particularly as it’s coming from a well-known cloud evangelist): at the infrastructure level stability matters more than agility or speed-of-deployment. Welcome to real world ;)

Automate Everything: ipSpace.net Is Coming Back to US

After the last US-based ipSpace.net workshop a lot of people asked me about the next one. It took a long time, but here it is: I’m running an on-site automation workshop together with several friends with outstanding hands-on experience in Colorado in late May.

Programmable ASICs on Software Gone Wild

During Cisco Live Europe 2017 (where I got thanks to the Tech Field Day crew kindly inviting me) I had a nice chat with Peter Jones, principal engineer @ Cisco Systems. We started with a totally tangential discussion on why startups fail, and quickly got back to flexible hardware and why one would want to have it in a switch.

Leaf-and-Spine Fabrics: Featured Webinar in April 2017

I recently finished editing the videos from the Leaf-and-Spine Designs update to the Leaf-and-Spine Fabrics webinar, so it wasn’t hard to select the featured webinar for April 2017. The featured videos now include BGP in the Data Center by Dinesh Dutt, SPB Deep Dive by Roger Lapuh, and VXLAN with EVPN control plane by Lukas Krattiger.

Starting with Network Automation

One of my readers considered joining the Building Network Automation Solutions course but wasn’t sure whether it would help him solve the challenges he’s facing in his network.

Fortunately, his challenges aren’t that hard to solve.

NETCONF Agent(s) on Cisco IOS XE 16.x

Evgeny made an interesting observation while testing the NETCONF client on IOS XE 16.x (see also this comment on my blog):

The most interesting part: for unknown reason IOS-XE gives different answers about capabilities on ports 830 and 22.

Einar quickly explained the mysterious behavior:

Worth Reading: Legacy software and Evolution

In case you’re wondering why we’re stuck with old stuff like TCP, IPv4, OSPF, and a few other bits and pieces that were invented decades ago when we could be using the glitzy controller-based software-defined whatever, read the blog post by Martin Sustrik. He talks about software, but we’re facing the same challenges in networking.

Video: Overlays in Data Center Fabrics

Lukas Krattiger (Cisco Systems) was the guest speaker in Layer-2+3 fabrics part of the Leaf-and-Spine Fabric Design webinar, and he started his presentation with an overview of how we use overlays in data center fabrics.

Network Automation Is Much More than Configuration Management

Most network automation presentations you can find on the Internet focus on configuration management, either to provision new boxes, or to provision new services, so it’s easy to assume that network automation is really a fancy new term for consistent device configuration management.

However, as I explained in the Network Automation 101 webinar, there’s so much more you can do and today I’d like to share a real-life example from Jaakko Rautanen, an alumni of my Building Network Automation Solutions online course.

Practice Your Wireshark-Fu with PCAP Challenges

Johannes Weber built a CCNP practice lab, configured 22 different protocols in it, and took packet captures of all of them happily chatting. To make things more interesting he created 45 challenges that you can solve with Wireshark using the pcap file he published.

Let’s Drop Some Random Commands, Shall We?

One of my readers sent me a link to CCO documentation containing this gem:

Beginning with Cisco NX-OS Release 7.0(3)I2(1), Cisco Nexus 9000 Series switches handle the CLI configuration actions in a different way than before the introduction of NX-API and DME. The NX-API and DME architecture introduces a delay in the communication between Cisco Nexus 9000 Series switches and the end host terminal sessions, for example SSH terminal sessions.

So far so good. We can probably tolerate some delay. However, the next sentence is a killer…

2017-05-08: The behavior is caused by an old bug in Linux TTY driver. Fixed NX-OS versions are planned to be shipped in late May 2017. More details here.

2017-04-05: The wonderful information disappeared from Cisco's documentation within 24 hours with no explanation whatsoever. However, I expected that and took a snapshot of that page before publishing the blog post ;)

Follow-up: Load Balancers and Session Stickiness

My Why Do We Need Session Stickiness in Load Balancing blog post generated numerous interesting comments and questions, so I decided to repost them and provide slightly longer answers to some of the questions.

Warning: long wall of text ahead.

NETCONF on Cisco Campus Switches on Software Gone Wild

During Cisco Live Europe (huge thanks to Tech Field Day crew for bringing me there) I had a chat with Jeff McLaughlin about NETCONF support on Cisco IOS XE, in particular on the campus switches.

We started with the obvious question “why would someone want to have NETCONF on a campus switch”, continued with “why would you use NETCONF and not REST API”, and diverted into “who loves regular expressions”. Teasing aside, we discussed:

Q&A: What Is a Hyperconverged Infrastructure?

I’m running a hyperconverged infrastructure event with Mitja Robas on April 6th, and so my friend Christoph Jaggi sent me a list of interesting questions, starting with:

What are hyperconverged infrastructures?

The German version of the interview is published on inside-it.ch.

Railroads and Cars: a Fairy Tale

Imagine a Flatworld in which railways are the main means of transportation. They were using horses and pigeons in the past, and experimenting with underwater airplanes, but railways won because they were cheaper than anything else (for whatever reason, price always wins over quality or convenience in that world).

As always, there were multiple railroad tracks and trains manufacturers, and everyone tried to use all sorts of interesting tricks to force the customers to buy tracks and trains from the same vendor. Different track gauges and heptagonal wheels that worked best with grooved rails were the usual tricks.

Securing Network Automation: Troopers 17 Presentation

Niki Vonderwell kindly invited me to Troopers 2017 and I decided to talk about security and reliability aspects of network automation.

The presentation is available on my web site, and I’ll post the link to the video when they upload it. An extended version of the presentation will eventually become part of Network Automation Use Cases webinar.

Cisco and Apple Agree: QoS Marking Is an Application Problem

The last presentation during the Tech Field Day Extra @ Cisco Live Europe event was a Cisco-Apple Partnership presentation, and we expected an hour of corporate marketese.

Can’t tell you how pleasantly surprised we were when Jerome Henry started his very technical presentation explaining the wireless goodies you get when using iOS with IOS.

Update: Virtual Switches in vSphere Environment

Just FYI: a week after I wrote this (don't forget to go through the comments), VMware made it official:

…we’ve found that VMware’s native virtual switch implementation has become the de facto standard for greater than 99% of vSphere customers today. … Moving forward, VMware will have a single virtual switch strategy that focuses on two sets of native virtual switch offerings – VMware vSphere® Standard Switch and vSphere Distributed Switch™ for VMware vSphere, and the Open virtual switch (OVS).

Video: SPB Deep Dive

During the Leaf-and-Spine Fabric Designs webinar Roger Lapuh from Avaya explained how Avaya uses SPB technology to build a L2+L3 fabric.

Updated: User Authentication in Ansible Network Modules

Ansible network modules (at least in the way they’re implemented in Ansible releases 2.1 and 2.2) were one of the more confusing aspects of my Building Network Automation Solutions online course (and based on what I’m seeing on various chat sites we weren’t the only ones).

I wrote an in-depth explanation of how you’re supposed to be using them a while ago and now updated it with user authentication information.

Why Do We Need Session Stickiness in Load Balancing?

One of the engineers watching my Data Center 3.0 webinar asked me why we need session stickiness in load balancing, what its impact is on load balancer performance, and whether we could get rid of it. Here’s the whole story from the networking perspective.

Two Switches Saga: Now in Text Format

Remember the All You Need Are Two Switches saga? Several readers told me they’d like to have in text (article) format, so I found a transcription service, and started editing what they produced and publishing it. The first two installments are already online.

On a related topic: we’ll discuss the viability of this approach in April DIGS event in Zurich, Switzerland.

Why Didn’t We Have Leaf-and-Spine Fabrics a Decade Ago?

One of my readers watched my Leaf-and-Spine Fabric Architectures webinar and had a follow-up question:

You mentioned 3-tier architecture was dictated primarily by port count and throughput limits. I can understand that port density was a problem, but can you elaborate why the throughput is also a limitation? Do you mean that core switch like 6500 also not suitable to build a 2-tier network in term of throughput?

As always, the short answer is it depends, in this case on your access port count and bandwidth requirements.

TCP in the Data Center and Beyond on Software Gone Wild

In autumn 2016 I embarked on a quest to figure out how TCP really works and whether big buffers in data center switches make sense. One of the obvious stops on this journey was a chat with Thomas Graf, Linux Core Team member and a founding member of the Cilium project.

Running vSphere on Cisco ACI? Think Twice…

When Cisco ACI was launched it promised to do everything you need (plus much more, and in multi-hypervisor environment). It was quickly obvious that you can’t do all that on ToR switches, and need control of the virtual switch (the real network edge) to get the job done.

To YANG or Not to YANG, That’s the Question

Yannis sent me an interesting challenge after reading my short “this is how I wasted my time” update:

We are very much committed in automation and use Ansible to create configuration and provision our SP and data center network. One of our principles is that we do rely solely on data available in external resources (databases and REST endpoints), and avoid fetching information/views from the network because that would create a loop.

You can almost feel a however coming in just a few seconds, right?

SDN Use Cases: Featured Webinar in March 2017

The featured webinar in March 2017 is the SDN Use Cases webinar describing over a dozen different real-life SDN use cases. The featured videos cover four of them: a data center fabric by Plexxi, microsegmentation (including VMware NSX), SDN-based Internet edge router built by David Barroso, and Fibbing - an OSPF-based traffic engineering developed at University of Louvain.

To view the videos, log into my.ipspace.net, select the webinar from the first page, and watch the videos marked with star.

Worth Reading: Building an OpenStack Private Cloud

It’s uncommon to find an organization that succeeds in building a private OpenStack-based cloud. It’s extremely rare to find one that documented and published the whole process like Paddy Power Betfair did with their OpenStack Reference Architecture whitepaper.

I was delighted to see they decided to do a lot of things I was preaching for ages in blog posts, webinars, and lately in my Next Generation Data Center online course.

Highlights include:

Video: Out-of-Band SDN Management Network

One of the challenges of designing a controller-based solution is the transport network used to exchange information between controller and controlled devices. Can you do that in-band or is it better to have an out-of-band network (built with traditional components)? Terry Slattery explained some of the pros and cons in the Monitoring SDN Networks webinar.

NETCONF Transactional Consistency on Cisco IOS XE

During the Tech Field Day Extra event at Cisco Live Europe 2017 Fabrizio Maccioni, Technical Marketing Engineer at Cisco, described enhanced programmability available in Cisco IOS XE release 16.x. What really got my attention was the claim that they made NETCONF on Cisco IOS transactional (and Fabrizio mentioned the candidate config and commit).

Here's my initial reaction:

Are You Ready for Building Next-Generation Data Center Course?

I often get questions from engineers wondering whether my webinars or courses would be too tough for them. Here’s a question I got from an engineer who wanted to attend my Building Next-Generation Data Center course: “What specific prior experience do you expect for this workshop?

The Ever-Increasing Complexity

Eyvonne Sharp wrote a great blog post describing Cisco’s love of complexity and how SD-WAN vendors proved things don’t have to be that complex.

I know Cisco (and every other vendor) loves making ever-more-complex solutions that lock you into their morass for the rest of your life (long-distance vMotion anyone?).

Worth Reading: Agile Development and Security

Matthias Luft (a good friend of mine, and a guest speaker in the upcoming Building Next-Generation Data Center course) wrote a great post about the (lack of) security in software development.

The parts I like most (and they apply equally well to networking):

CloudScale ASICs on Software Gone Wild

Last year Cisco launched a new series of Nexus 9000 switches with table sizes that didn’t match any of the known merchant silicon ASICs. It was obvious they had to be using their own silicon – the CloudScale ASIC. Lukas Krattiger was kind enough to describe some of the details last November, resulting in Episode 73 of Software Gone Wild.

For even more details, watch the Cisco Nexus 9000 Architecture Cisco Live presentation.

Nerd Knobs Save the Day: NSSA Saga Continues

Remember the OSPF NSSA Forwarding Address kludge and its consequences? Let’s figure out whether the nerd knobs available in Cisco IOS can save the day.

TL&DR: Don’t use OSPF areas if you can avoid them. Don’t use NSSA areas.

Guest Speakers in the Building Next-Generation Data Center Course

I managed to get another awesome lineup of guest speakers for the Spring 2017 Building Next-Generation Data Center course (starting in less than a month):

Scott Lowe will open the course with a presentation on the impact of open source software in data center environments.

Navigating Complex Data Structures in Ansible Playbooks

Have you ever tried to navigate complex data structures within Ansible playbooks using awkward looping constructs and convoluted map filters?

It might be easier to munge the data structure into a more appropriate format first and then use the munged data in subsequent tasks. Wondering how to do it?

Leaf-and-Spine Fabrics versus Fabric Extenders

One of my readers wondered what the difference between fabric extenders and leaf-and-spine fabrics is:

We are building a new data center for DR and we management is wanting me to put in recommendations to either stick with our current Cisco 7k to 2k ToR FEX solution, or prepare for what seems to be the future of DC in that spine leaf architecture.

Let’s start with “what is leaf-and-spine architecture?

Newer Docker Networking Options

In the last part of the free Docker Networking Fundamentals webinar Dinesh Dutt described the newer high-performance networking options (Macvlan and Ipvlan) introduced in Docker version 1.12.

Facebook Backpack Behind the Scenes

When Facebook announced 6-pack (their first chassis switch) my reaction was “meh” (as well as “I would love to hear what Brad Hedlund has to say about it”). When Facebook announced Backpack I mostly ignored the announcement. After all, when one of the cloud-scale unicorns starts talking about their infrastructure, what they tell you is usually low on detail and used primarily as talent attracting tool.

NextGenDC: Securing a Hybrid Cloud with Matthias Luft

Imagine you were asked to migrate some of the workloads running in your data center into a public (or managed) cloud. These workloads still have to access the data residing in your data center – a typical hybrid cloud deployment.

Next thing you know you have to deal with your (C)ISO and his/her usual concerns as well as the variety of articles on tech sites stating that "security is the biggest challenge of cloud adoption".

Network Automation and Undifferentiated Heavy Lifting

I got this tweet after publishing the “use Ansible to execute a single command on all routers” blog post (and a few similar comments on the blog post itself)

Or use Python, Netmiko and a simple For loop

I never cease to be amazed by the urge to do undifferentiated heavy lifting in the IT industry.

Q&A: Migrating to Modern Data Center Infrastructure

One of my readers sent me a list of questions after watching some of my videos, starting with a generic one:

While working self within large corporations for a long time, I am asking myself how it will be possible to move from messy infrastructure we grew over the years to a modern architecture.

Usually by building a parallel infrastructure and eventually retiring the old one, otherwise you’ll end up with layers of kludges. Obviously, the old infrastructure will lurk around for years (I know people who use this approach and currently run three generations of infrastructure).

OpenConfig: From Basics to Implementations

In 2013, large-scale cloud providers and ISPs decided they had enough of the glacial IETF process of generating YANG models used to describe device configuration and started OpenConfig – a customer-only initiative that quickly created data models covering typical use cases of the founding members (aka “What Does Google Need”).

More Thoughts on OSPF Forwarding Address

Angelos Vassiliou sent me an interesting lengthy email after I published my OSPF Forwarding Address series (part 1, part 2, part 3, part 4). I asked him whether it’s OK to publish his email together with my responses as a blog post and he gracefully agreed, so here it is.

EVPN: All that Glitters Is Not Gold

Cumulus Linux 3.2 shipped with a rudimentary EVPN implementation and everyone got really excited, including smaller ASIC manufacturers that finally got a control plane for their hardware VTEP functionality.

However, while it’s nice to have EVPN support in Cumulus Linux, the claims of its benefits are sometimes greatly exaggerated.

Use Ansible to Execute a Single Command on All Routers

I was using Ansible playbooks to configure Cisco IOS routers running in VIRL and wanted to extract the router configurations before stopping the simulation.

You can download the playbooks from my Github repository, and here’s how you can run Ansible with VIRL.

Network Automation 101: Featured Webinar in February 2017

The featured webinar in February 2017 is the Network Automation 101 webinar, and the featured video describes the reasons you should be interested in network automation, its basics, and the difference between automation and orchestration.

Video: Simplify BGP Configurations

Running BGP instead of an IGP in your leaf-and-spine fabric sounds like an interesting idea (particularly if your fabric is large). Configuring a zillion BGP knobs on every box doesn’t.

However, BGP doesn’t have to be complex. In the Simplify BGP Configurations video (part of leaf-and-spine fabric designs webinar) Dinesh Dutt explains how you can make BGP configurations simple and easy-to-understand.

The Unintended Consequences of NSSA Kludges

Remember the kludges needed to make OSPF NSSA areas work correctly? We concluded that saga by showing how the rules of RFC 3101 force a poor ASBR to choose an IP address on one of its OSPF-enabled interfaces as a forwarding address to be used in Type-7 LSA.

What could possibly go wrong with such a “simple” concept?

New Webinar: Automating Network Services

In the next session of Network Automation Use Cases webinar (on Thursday, February 16th) I’ll describe how you could implement automatic deployment of network services, and what you could do to minimize the impact of unintended consequences.

If you attended one of the previous sessions of this webinar, you’re already registered for this one, if not, visit this page and register.

And This Is Why Relying on Linux Makes Sense

Most networking operating systems include a mechanism to roll back device configuration and/or create configuration snapshots. These mechanisms usually work only for the device configuration, but do not include operating system images or other components (example: crypto keys).

Now imagine using RFC 1925 rule 6a and changing the “configuration rollback” problem into “file system snapshot” problem. That’s exactly what Cumulus Linux does in its newest release. Does it make sense? It depends.

Updated: Using Ansible Playbooks with Cisco VIRL

Some of the engineers building Ansible-with-VIRL lab in my Building Network Automation Solutions online course experienced interesting challenges, so I made the how-to instructions more explicit and added a troubleshooting section to the Using Ansible Playbooks with Cisco VIRL document. Hope you’ll find them useful.

Linux Networking Update from NetDev Conference on Software Gone Wild

When I recorded the first podcast with Thomas Graf we both found it so much fun that we decided to do it again. Thomas had attended the NetDev 1.2 conference so when we met in November 2016 we warmed up with What’s NetDev and then started discussing the hot new networking stuff being added to Linux kernel:

Why OSPF Needs Forwarding Address with NSSA Areas

In the previous blog posts I described how OSPF tries to solve some broken designs with Forwarding Address field in Type-5 LSA – a kludge that unnecessarily increases the already too-high complexity of OSPF.

NSSA areas make the whole thing worse: OSPF needs Forwarding Address in Type-5 LSAs generated from Type-7 LSAs to ensure optimal packet forwarding. Here’s why:

Managing Network Services Configuration with Ansible

In the last few weeks I’ve seen numerous questions along the lines of “how do I manage VLANs on my switch with Ansible”. You can look at this question from two perspectives: the low-level details (which modules do I use, how do I push commands to the box…) or the high-level challenges (how do I make sure actual device state matches desired device state). Obviously I’m interested in the latter.

Why Are High-Speed Links Better than Port Channels or ECMP

I’m positive I’ve answered this question a dozen times in various blog posts and webinars, but it keeps coming back:

You always mention that high speed links are always better than parallel low speed links, for example 2 x 40GE is better than 8 x 10GE. What is the rationale behind this?

Here’s the N+1-th answer (hoping I’m being consistent):

Increasing SDDC Visibility

In Episode 69 of Software Gone Wild we discussed ways of increasing visibility into VXLAN transport fabric. Another thing we badly need is visibility into the virtual edge behavior, and to help you get there Iwan Rahabok created a set of vRealize dashboards that include the virtual edge networking components. Hope you’ll find them useful.

To Drop or To Delay, That’s the Question on Software Gone Wild

A while ago I decided it's time to figure out whether it's better to drop or to delay TCP packets, and quickly figured out you get 12 opinions (usually with no real arguments supporting them) if you ask 10 people. Fortunately, I know someone who deals with TCP performance for living, and Juho Snellman was kind enough to agree to record another podcast.

Update 2017-03-31: Added More information section

OSPF Forwarding Address YAK: Take 2

In my initial OSPF Forwarding Address blog post I described a common Forwarding Address (FA) use case (at least as preached on the Internet): two ASBRs connected to a single external subnet with route redistributing configured only on one of them.

That design is clearly broken from the reliability perspective, but are there other designs where OSPF FA might make sense?

Using Ansible Networking Modules

One of the engineers attending my Building Network Automation Solutions online course got the lab up and running, wanted to execute a simple IOS command from an Ansible playbook and failed.

He quickly realized he needs to set connection to local; for more details read this article on my automation web site or watch the Ansible for Networking Engineers webinar.

New Webinar: PowerShell for Networking Engineers

Ansible (or Python+Paramiko/Netmiko) seems to be the tool used in most do-it-yourself network automation presentations and videos. Did you know there’s a scripting/automation alternative that’s hugely popular in parts of sysadmin and virtualization universe that almost nobody talks about in networking (because everyone is focused on huge data center fabrics and unicorns) – PowerShell (now also available on OSX and Linux).

Never Take Two Chronometers to Sea

One of the quotes I found in the Mythical Man-Month came from the pre-GPS days: “never go to sea with two chronometers, take one or three”, and it’s amazing the networking industry (and a few others) never got the message.

Linux CLI for Networking Engineers

One would think that we're the only ones struggling with Linux CLI (read: bash). Seems like cyber security professionals might be in the same boat according to the nice summary of dozens of Linux/bash commands collected by Robert Graham.

Multi-Host Container Networking

Running Linux containers on a single host is relatively easy. Building private multi-tenant networks across multiple hosts immediately creates the usual networking mess.

Fortunately the Socketplane team did a pretty good job; for more details watch the video from Docker Networking Fundamentals webinar or listen to the podcast I did with them a year ago.

OSPF Forwarding Address: Yet another Kludge

One of my readers sent me an interesting NSSA question (more in a future blog post) that sent me chasing for the reasons behind the OSPF Forwarding Address (FA) field in type-5 and type-7 LSAs.

This is the typical scenario for OSPF FA I was able to find on the Internet:

New Webinar: Automating Data Center Fabric Deployments

The next session of the Network Automation Use Cases series will take place on January 24th. Dinesh Dutt will explain describe how you can use Ansible and Jinja2 to automate data center fabric deployments, and I’ll have a few things to say about automating network security.

If you think that what Dinesh will talk about applies only to startups you’re totally wrong. UBS is using the exact same approach to roll out their new data centers; Thomas Wacker will share the details in his guest presentation in the next Building Next-Generation Data Centers online course.

It’s Security Ignorance, not Featuritis

A blog post by Russ White pointed me to an article describing how IPv6 services tend to be less protected than IPv4 services. No surprise there, people like Eric Vyncke and I were telling anyone who was willing to listen that operating two-protocol networks isn’t the same thing as operating a single-protocol one (see also RFC 1925 rule 4).

Worth Reading: the Mythical Man-Month

I was discussing a totally unrelated topic with Terry Slattery when he mentioned a quote from the Mythical Man-Month. It got me curious, I started exploring and found out I can get the book as part of my Safari subscription.

VXLAN Ping and Traceroute

From the moment Cisco and VMware announced VXLAN some networking engineers complained that they'd lose visibility into the end-to-end path. It took a long while, but finally the troubleshooting tools started appearing in VXLAN environment: NVO3 working group defined Fault Managemnet framework for overlay networks and Cisco implemented at least parts of it in recent Nexus OS releases.

You'll find more details in Software Gone Wild Episode 69 recorded with Lukas Krattiger in November 2016 (you can also watch VXLAN Technical Deep Dive webinar to learn more about VXLAN).

Parsing Printouts with Ansible Regular Expression Filters

Ansible is great at capturing and using JSON-formatted data returned by REST API (or any other script or method it can invoke), but unfortunately some of us still have to deal with network devices that cannot even spell structured data or REST.

Introduction to Docker: Featured Video of January 2017

The featured webinar in January 2017 is the Introduction to Docker webinar, and in the featured video Matt Oswalt explains the basic Docker tasks. Other videos in this webinar cover Docker images, volumes, networking, and Docker Compose and Swarm.

To view the featured video, log into my.ipspace.net, select the webinar from the first page, and watch the video marked with star.

Device Configurations Are Not a Good Source of Truth

One of my subscribers sent me this question after watching the second part of Network Automation Tools webinar (or maybe it was Elisa Jasinska's presentation in the Data Center course):

Elisa mentions that for a given piece of data, there should be “one source of truth”. It gets a bit muddled when you have an IPAM tool and Git source control simultaneously. It is not hard to imagine scenarios where these get out of sync especially if you consider multi-operator scenarios.

Confused? He provided a simple scenario:

Plans for 2017

With January 6th the Christmas/New Year holidays are over even for most European countries, so it’s time to restart my blog and set some goals for 2017.

Webinars

2015 was year of SDN, 2016 was year of network automation, and 2017 is shaping up to be the year of the cloud.