TL&DR: Ansible might decide to reorder list values in a loop parameter, resulting in unexpected order of execution and (in my case) totally borked device configuration.
A bit of a background first: I’m using an Ansible playbook within netlab to deploy initial device configurations. Among other things, that playbook deploys configuration snippets for numerous configuration modules, and the order of deployment is absolutely crucial. For example, you cannot activate BGP neighbors in Labeled Unicast (BGP-LU) address family (mpls module) before configuring BGP neighbors (bgp module).
Yesterday I wrote a frustrated tweet after wasting an hour trying to figure out why a combination of OSPF and IS-IS routing worked on Cisco IOS but not on Nexus OS. Having to wait for a minute (after Vagrant told me SSH on Nexus 9300v was ready) for NX-OS to “boot” its Ethernet module did’t improve my mood either, and the inconsistencies in NX-OS interface naming (Ethernet1/1 is uppercase while loopback0 and mgmt0 are lowercase) were just the cherry on top of the pile of ****. Anyway, here’s what I wrote:
Can’t tell you how much I hate Ansible’s lame attempts to do idempotent device configuration changes. Wasted an hour trying to figure out what’s wrong with my Nexus OS config… only to find out that “interface X” cannot appear twice in the configuration you want to push.
Not unexpectedly, I got a few (polite and diplomatic) replies from engineers who felt addressed by that tweet, and less enthusiastic response from the product manager (no surprise there), so it’s only fair to document exactly what made me so angry.
Recording the same content for the third time because software developers decided to write code before figuring out what needs to be done is disgusting… so it took me a long long while before I collected enough willpower to rewrite and retest all the examples and re-record the Getting Operational Data section of Ansible for Networking Engineers webinar.
The new videos explain how to consume data generated by show commands in JSON or XML format, and how to parse the traditional text-based show printouts. I dropped mentions of (semi)failed experiments like Ansible parse_cli and focused on things that work well: TextFSM, in particular with ntc-templates library, pyATS/Genie, and TTP. On the positive side, I liked the slick new cli_parse module… let’s hope it will stay that way for at least a few years.
On a totally unrelated topic, I realized (again) that fail fast, fail often sounds great in a VC pitch deck, and sucks when you have to deal with its results.
One of the attendees of my Building Network Automation Solutions online course quickly realized a limitation of Ansible (by far the most popular network automation tool): it stores all the information in random text files. Here’s what he wrote:
I’ve been playing around with Ansible a lot, and I figure that keeping random YAML files lying around to store information about routers and switches is not very uh, scalable. What’s everyone’s favorite way to store all the things?
He’s definitely right (and we spent a whole session in the network automation course discussing that).
One of our subscribers sent me this email when trying to use ideas from Ansible for Networking Engineers webinar to build BGP route reflector configuration:
I’m currently discovering Ansible/Jinja2 and trying to create BGP route reflector configuration from Jinja2 template using Ansible playbook. As part of group_vars YAML file, I wish to list all route reflector clients IP address. When I have 50+ neighbors, the YAML file gets quite unreadable and it’s hard to see data model anymore.
Whenever you hit a roadblock like this one, you should start with the bigger picture and maybe redefine the problem.
One of the first roadblocks you’ll hit in your “let’s master Ansible” journey will be a weird error deep inside a Jinja2 template. Can we manage that complexity somehow… or as one of the participants in our Building Network Automation Solutions online course asked:
Is there any recommendation/best practices on Jinja templates size and/or complexity, when is it time to split single template into function portions, what do you guys do? And what is better in terms of where to put logic - into jinja or playbooks
One of my friends described the challenge as “Debugging Ansible is one of the most terrible experiences one can endure…” and debugging Jinja2 errors within Ansible playbooks is even worse, but there are still a few things you can do.
Some network devices return structured data in either text- or XML format (but cannot spell JSON). Ansible prefers getting JSON-formatted data, and has a number of filters to process text printouts… but what could you do if you want to work with XML documents within Ansible? I described a few solutions in Transforming XML Data in Ansible.
Here’s an interesting tidbit from “Last Week in AWS” blog:
From a philosophical point of view, AWS fundamentally considers an API to be a promise. Services that aren’t promoted anymore are still available […] Think about that for a second - a service launched 13 years ago is still actively supported to the point where you can use it today.
A month ago I described how Paddy Kelly used pyATS to get VRF data from a Cisco router to create per-VRF connectivity graphs.
Recently he also wrote a short article describing how to get started with pyATS and Ansible. Thank you, Paddy!
Imagine you want to create a Jinja2 report that includes only a select subset of elements of a data structure… and want to have header, footer, and element count in that report.
Traditionally we’d solve that challenge with an extra variable, but as Jinja2 variables don’t survive loop termination, the code to do that in Jinja2 gets exceedingly convoluted.
Fortunately, Jinja2 provides a better way: using a conditional expression to select the elements you want to iterate over.
Have you ever seen an Ansible playbook where 90% of the code prepares the environment, and then all the work is done in a few template and assemble modules? Here’s an alternative way of getting that done. Is it better? You tell me ;)
I had a fantastic chat with David Bombal a while ago in which we covered tons of network automation topics including “should I use Nornir or NAPALM or Netmiko?”
The only answer one can give would be “it depends… on what you’re trying to do” as these three tools solve completely different challenges.
Paramiko is SSH implementation in Python. It’s used by most Python tools that want to use SSH to connect to other hosts (including networking devices).
As I was preparing the materials for Ansible 2.7 Update webinar sessions I wanted to dive deeper into declarative configuration modules, starting with “I wonder what’s going on behind the scenes”
No problem: configure EEM applet command logging on Cisco IOS and execute an ios_interface module (more about that in another blog post)
Next step: let’s see how multi-platform modules work. Ansible has net_interface module that’s supposed to be used to configure interfaces on many different platforms significantly simplifying Ansible playbooks.
I always love to hear from networking engineers who managed to start their network automation journey. Here’s what one of them wrote after watching Ansible for Networking Engineers webinar (part of paid ipSpace.net subscription, also available as an online course).
This webinar helped me a lot in understanding Ansible and the benefits we can gain. It is a big area to grasp for a non-coder and this webinar was exactly what I needed to get started (in a lab), including a lot of tips and tricks and how to think. It was more fun than I expected so started with Python just to get a better grasp of programing and Jinja.
In early 2019 we made the webinar even better with a series of live sessions covering new features added to recent Ansible releases, from core features (loops) to networking plugins and new declarative intent modules.