Blog Posts in November 2017
Automate End-to-End Latency Measurements
Here’s another idea from the Building Network Automation Solutions online course: Ruben Tripiana decided to implement a latency measurement tool. His playbook takes a list of managed devices from Ansible inventory, generates a set of unique device pairs, measures latency between them, and produces a summary report (see also his description of the project).
BGP as a Better IGP? When and Where?
A while ago I helped a large enterprise redesign their data center fabric. They did a wonderful job optimizing their infrastructure, so all they really needed were two switches in each location.
Some vendors couldn’t fathom that. One of them proposed to build a “future-proof” (and twice as expensive) leaf-and-spine fabric with two leaves and two spines. On top of that they proposed to use EBGP as the only routing protocol because draft-lapukhov-bgp-routing-large-dc – a clear case of missing the customer needs.
Security or Convenience, That’s the Question
One of my readers was so delighted that something finally happened after I wrote about a NX-OS bug that he sent me a pointer to another one that has been pending for a long while, and is now officially terminated as FAD (Functions-as-Designed… even documented in the Further Problem Description).
Here’s what he wrote (slightly reworded)…
It’s Bash Scripts All the Way Down (more on CLI versus API)
Netfortius made an interesting comment to my Ansible playbook as a bash script blog post:
Ivan - aren't we now moving the "CLI"[-like] approach, upstream (the one we are just trying to depart, via the more structured and robust approach of RESTAPI).
As I explained several times, I don’t know where the we must get rid of CLI ideas are coming from; the CLI is root of all evil mantra is just hype generated by startups selling alternative approaches (the best part: one of them was actually demonstrating their product using CLI).
New Video: Whitespace Handling in Jinja2
Whitespace handling is one of the most confusing aspects of Jinja2, thoroughly frustrating many attendees of my Ansible and Network Automation online courses.
I decided to fix that, ran a few well-controlled experiments, and documented the findings and common caveats in Whitespace Handling in Jinja2 video.
Video: Using Simple PowerShell Scripts
After explaining the basics of PowerShell, Mitja Robas described how to do implement the “Hello, World!” of network automation (collecting printouts from network devices) in PowerShell.
To watch all videos from this free webinar, register here.
You’ll need at least free ipSpace.net subscription to watch the video.
Let’s Pretend We Run Distributed Storage over a Thick Yellow Cable
One of my friends wanted to design a nice-and-easy layer-3 leaf-and-spine fabric for a new data center, and got blindsided by a hyperconverged vendor. Here’s what he wrote:
We wanted to have a spine/leaf L3 topology for an NSX deployment but can’t do that because the Nutanix servers require L2 between their nodes so they can be in the same cluster.
I wanted to check his claims, but Nutanix doesn’t publish their documentation (I would consider that a red flag), so I’m assuming he’s right until someone proves otherwise (note: whitepaper is not a proof of anything ;).
Feedback: Ansible for Networking Engineers
Got this feedback on my Ansible for Networking Engineers webinar:
This webinar is very comprehensive compared to any other Ansible webinars available out there. Ivan does great job of mapping and using real life example which is directly related to daily tasks.
The Ansible online course is even better: it includes support, additional hands-on exercises, sample playbooks, case studies, and lab instructions.
However, Ansible is just a tool that shouldn’t be missing from your toolbox. If you need a bigger picture, consider the Building Network Automation Solutions online course (and register ASAP to save $700 with the Enthusiast ticket).
Why Does It Take So Long to Upgrade Network Devices?
One of my readers sent me a question about his favorite annoyance:
During my long practice, I’ve never seen an Enterprise successfully managing the network device software upgrade/patching cycles. It seems like nothing changed in the last 20 years - despite technical progress, in still takes years (not months) to refresh software in your network.
There are two aspects to this:
Worth Reading: Classifying Route Leaks
Did you know there’s an RFC describing typical BGP route leaks? I didn’t until I stumbled upon this blog post.
Optimize Data Center Infrastructure: Build an Optimized Fabric
I published the last part of my Optimize Data Center Infrastructure series: build an optimized data center fabric.
To learn more about data center fabric designs, check the new online course or enroll into the Spring 2018 session of Building Next-Generation Data Center course.
Pluribus Networks… 2 Years Later
I first met Pluribus Networks 2.5 years ago during their Networking Field Day 9 presentation, which turned controversial enough that I was advised not to wear the same sweater during NFD16 to avoid jinxing another presentation (I also admit to be a bit biased in those days based on marketing deja-moo from a Pluribus sales guy I’d been exposed to during a customer engagement).
Pluribus NFD16 presentations were better; here’s what I got from them:
Run Well-Designed Experiments to Learn Faster
I know that everyone learns in a slightly different way. Let me share the approach that usually works well for me when a tough topic I’m trying to master includes a practical (hands-on) component: running controlled experiments.
Sounds arcane and purely academic? How about a simple example?
A week ago I talked about this same concept in the Building Network Automation Solutions online course. The video is already online and you get immediate access to it (and the rest of the course) when you register for the next live session.
Another Reason to Run Linux on Your Data Center Switches
Arista’s OpenFlow implementation doesn’t support TLS encryption. Usually that’s not a big deal, as there aren’t that many customers using OpenFlow anyway, and those that do hopefully do it over a well-protected management network.
However, lack of OpenFlow TLS encryption might become an RFP showstopper… not because the customer would really need it but because the customer is in CYA mode (we don’t know what this feature is or why we’d use it, but it might be handy in a decade, so we must have it now) or because someone wants to eliminate certain vendors based on some obscure missing feature.
New Dates for the Building Network Automation Solutions Online Course
We’re slowly wrapping up the autumn 2017 Building Network Automation Solutions online course, so it’s time to schedule the next one. It will start on February 13th and you can already register (and save $700 over regular price as long as there are Enthusiast tickets left).
Do note that you get access to all course content (including the recordings of autumn 2017 sessions) the moment you register for the course. You can also start building your lab and working on hands-on exercises way before the course starts.
Things that cannot go wrong
Found this Douglas Adams quote in The Signal and the Noise (a must-read book):
The major difference between a thing that might go wrong and a thing that cannot possibly go wrong is that when a thing that cannot possibly go wrong goes wrong it usually turns out to be impossible to get at or repair
I’ll leave to your imagination how this relates to stretched VLANs, ACI, NSX, VSAN, SD-WAN and a few other technologies.
Video: Separate Data from Code
After explaining the challenges of data center fabric deployments, Dinesh Dutt focused on a very important topic I covered in Week#3 of the Building Network Automation Solutions online course: how do you separate data (data model describing data center fabric) from code (Ansible playbooks and device configurations)
Create a VLAN Map from Network Operational Data
It’s always great to see students enrolled in Building Network Automation Solutions online course using ideas from my sample playbooks to implement a wonderful solution that solves a real-life problem.
James McCutcheon did exactly that: he took my LLDP-to-Graph playbook and used it to graph VLANs stretching across multiple switches (and provided a good description of his solution).
DMVPN or Firewall-Based VPNs?
One of my readers sent me this question:
I'm having an internal debate whether to use firewall-based VPNs or DMVPN to connect several sites if our MPLS connection goes down. How would you handle it? Do you have specific courses answering this question?
As always, the correct answer is it depends, in this case on:
Update: Cisco Nexus Switches
Third vendor in this year’s series of data center switching updates: Cisco.
As expected, Cisco launched a number of new switches in 2017, and EOL’d older models … for pretty varying value of old. For example, most of the original Nexus 9300 models are gone.
… updated on Thursday, December 15, 2022 10:07 UTC
The Three Paths of Enterprise IT
Everyone knows that Service Providers and Enterprise networks diverged decades ago. More precisely, organizations that offer network connectivity as their core business usually (but not always) behave differently from organizations that use networking to support their core business.
Obviously, there are grey areas: from people claiming to be service providers who can’t get their act together, to departments (or whole organizations) who run enterprise networks that look a lot like traditional service provider networks because they’re effectively an internal service provider.
Where Does Automation Fit into Enterprise IT?
One of my readers coming from system development area asked a fundamental question about the role of automation in enterprise IT (somewhat paraphrased):
[In system development] we automate typical tasks from the pre-defined task repository, so I would like to understand broader context as the automation (I guess) is just a part of the change we want to do in the system. Someone needs to decide what to do, someone needs to accept the change and finally the automation is used.
Of course he’s absolutely right.