Tom Ammon sent me his thoughts on choosing the right level of abstraction in your network automation solution as a response to my What Is Intent-Based Networking blog post, and allowed me to publish them on ipspace.net.
I totally agree with your what vs how example with OSPF. I work on a NOS team where if we wanted, we could say, instead of “run OSPF on these links”, do this:
Roman Dodin wrote an article describing Nokia’s Ansible collection for SR Linux. Although I don’t use SR Linux (even though it was the first container supported by netlab ;), it was still very interesting to read about the design tradeoffs they had to make:
A senior engineer at Juniper Networks wasn’t happy with me mentioning resource hogs and Junos platforms in the same statement. Instead of engaging in never-ending angels dancing on pins deliberations comparing the virtues of Junos with other network operating systems, I decided to throw a bit of real-life data into the mix – I created a simple script that measures:
- The time it takes to execute vagrant up to start a single network device.
- The time it takes to deploy simple initial configuration on that device.
[Our network has] about 40 sites, but we don’t do total refresh cycles in bulk, just as needed. Everything we do is sporadic, and I’m trying to see the ROI on learning automation for things that are done once in a while that don’t take much time to do manually anyway.
There are two aspects to this part of his comment:
David Gee couldn’t resist making a few choice comments after I asked for his opinion of an early draft of the Network Automation Expert Beginners blog post, and allowed me to share them with you. Enjoy 😉
Network automation offers promises of reliability and efficiency, but it came without a warning label and health warnings. We seem to be perpetually stuck in a window display with sexily dressed mannequins.
I usually post links to my blog posts to LinkedIn, and often get extraordinary comments. Unfortunately, those comments usually get lost in the mists of social media fog after a few weeks, so I’m trying to save them by reposting them as blog posts (always with original author’s permission). Here’s a comment David Sun left on my Network Automation Expert Beginners blog post
The most successful automation I’ve seen comes from orgs who start with proper software requirements specifications and more importantly, the proper organizational/leadership backing to document and support said infrastructure automation tooling.
After reading the article, you might want to listen to the Salt and SaltStack podcast we did with Mircea a long while ago, and watch his presentation in Building Network Automation Solutions online course (also accessible with Expert Subscription).
Some network automation skeptics came to that place the hard way: they got burned by half-baked semi-tested systems. This is what one of my good friends had to say in a LinkedIn comment:
I am suspicious of automation, as I’ve unfortunately seen too many outages caused by either human error or faulty automation. Every time it required human CLI/GUI intervention to correct it. The problem is that the more automation we push, the fewer people know how to use the “old school” way to administer stuff.
Network automation is not the only IT discipline that could cause hard-to-correct errors requiring manual intervention. I’m positive everyone knows at least one horror story resulting in manual tweaking of the Windows registry, or a sequence of arcane SQL commands1.
Here’s some common-sense view on hard-coded rules versus machine learning in network operations by Mark Seery – quite often we can specify our response to an event as a simple set of rules, but if we want to identify deviation from “normal” behavior, machine learning might not be a bad idea.
I keep getting questions along the lines of “is network automation practical/a reality?” with arguments like:
Many do not see a value and are OK with just a configuration manager such as Arista CVP (CloudVision Portal) and Cisco DNA.
Configuration consistently is a huge win regardless of how you implement it (it’s perfectly fine if the tools your vendor providers work for you). It prevents opportunistic consistency, as Antti Ristimäki succinctly explained:
I really don’t see how a network any larger and more complex than a small and simple enterprise or campus network can be developed and engineered in a consistent manner without full automation. At least routing intensive networks might have very complex configurations related to e.g. routing policies and it would be next to impossible to configure them manually, at least without errors and in a consistent way.
I could spend days writing riffs on some of the more creative (in whatever dimension) comments left on my blog post or LinkedIn1. Here’s one about uselessness of network automation in cloud infrastructure (take that, AWS!):
If the problem is well known you can apply rules to it (automation). The problem with networking is that it results in a huge number of cases that are not known in advance. And I don’t mean only the stuff you add/remove to fix operational problems. A friend in one of the biggest private clouds was saying that more than 50% of transport services are customized (a static route here, a PBR there etc) or require customization during their lifecycle (e.g. add/remove a knob). Telcos are “worse” and for good reasons.
Yeah, I’ve seen such environments. I had discussions with a wide plethora of people building private and public (telco) clouds, and summarized the few things I learned (not many of them good) in Address the Business Challenges First part of the Business Aspects of Networking Technologies webinar.
A few months ago, Urs Baumann created NetTowel, a very nice CLI wrapper around several popular libraries, including Jinja2, TTP, NetMiko and netaddr. Although it seems he got busy with other things in recent months, and the development stalled a bit, the tool is definitely worth exploring.
Some of the blog comments never cease to amaze me. Here’s one questioning the value of network automation:
I think there is a more fundamental reason than the (in my opinion simplistic) lack of skills argument. As someone mentioned on twitter
“Rules make it harder to enact change. Automation is essentially a set of rules.”
We underestimated the fact that infrastructure is a value differentiator for many and that customization and rapid change don’t go hand in hand with automation.
Whenever someone starts using MBA-speak like value differentiator in a technical arguments, I get an acute allergic reaction, but maybe he’s right.
Responding to my Infrastructure as Code Sounds Scary blog post, Deepak Arora posted an interesting (and unfortunately way too accurate) list of challenges you might encounter when trying to introduce network automation in an enterprise environment.
He graciously allowed me to repost his thoughts on my blog.
Why don’t we agree on that :