Network Automation RFP Requirements
After finishing the network automation part of a recent SDN workshop I told the attendees “Vote with your wallet. If your current vendor doesn’t support the network automation functionality you need, move on.”
Not surprisingly, the next question was “And what shall we ask for?” Here’s a short list of ideas; please add yours in the comments.
Programmable Interface (API)
The device MUST have an on-device programmable API (NETCONF or REST) that allows an external script to:
- Get device configuration
- Get operational data
- Change device configuration.
I don’t want to hear about “solutions” that insert layers of kludges between my script and the device I want to manage. If I can’t access the device itself using NETCONF or REST I’m no longer interested. After all, my calendar is showing 2016.
Pass: Most networking vendors, at least in recent software releases.
Fail: List your grievances in the comments ;)
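To make the first requirement concrete, here is a minimal sketch of the request an external script would send to get the device configuration over NETCONF. It only builds the standard `<get-config>` RPC with the Python standard library; opening the SSH session and framing the message are left out, and the function name `build_get_config` is my own invention for illustration.

```python
import xml.etree.ElementTree as ET

# Base NETCONF namespace defined in RFC 6241.
NETCONF_NS = "urn:ietf:params:xml:ns:netconf:base:1.0"


def build_get_config(message_id: str, source: str = "running") -> str:
    """Return a NETCONF <get-config> request for the given datastore.

    build_get_config is a hypothetical helper, not part of any library.
    """
    rpc = ET.Element(f"{{{NETCONF_NS}}}rpc", {"message-id": message_id})
    get_config = ET.SubElement(rpc, f"{{{NETCONF_NS}}}get-config")
    source_el = ET.SubElement(get_config, f"{{{NETCONF_NS}}}source")
    # The datastore is named by an empty element such as <running/>.
    ET.SubElement(source_el, f"{{{NETCONF_NS}}}{source}")
    return ET.tostring(rpc, encoding="unicode")


request = build_get_config("101")
print(request)
```

ElementTree writes generated namespace prefixes (for example `ns0:`), which is valid XML. In practice you would more likely use an existing NETCONF client library than hand-build the RPC.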
Structured Operational Data
The device MUST return operational data as structured data (JSON or XML format), not as text printouts wrapped in XML or JSON envelopes.
I’ve had enough of screen scraping in the 30 years I’ve had to deal with networking devices. I don’t want to write another Expect script or TextFSM definition. My calendar is still showing 2016.
Pass: Junos, Nexus OS, Arista EOS, Brocade VDX, ALU/Nokia
Fail: Cisco IOS
Nice try: Cisco IOS XE with REST API (it returns a minimalistic set of operational data, see also feature parity below).
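The difference between structured data and a text printout in an envelope is easy to show. The sketch below uses two invented JSON responses (the keys `interfaces`, `oper_status` and `output` are assumptions, not any vendor's schema) and a rough heuristic that could serve as an acceptance check during RFP testing.

```python
import json

# Hypothetical responses, invented for illustration; not taken from any device.
STRUCTURED = json.dumps({
    "interfaces": [
        {"name": "Ethernet1", "oper_status": "up", "in_errors": 0},
        {"name": "Ethernet2", "oper_status": "down", "in_errors": 12},
    ]
})
TEXT_IN_ENVELOPE = json.dumps({
    "output": "Ethernet1 is up\n  0 input errors\n"
              "Ethernet2 is down\n  12 input errors\n"
})


def is_text_in_envelope(payload: str) -> bool:
    """Heuristic: the JSON object holds nothing but multi-line strings."""
    data = json.loads(payload)
    return (
        isinstance(data, dict)
        and len(data) > 0
        and all(isinstance(v, str) and "\n" in v for v in data.values())
    )


def down_interfaces(payload: str) -> list[str]:
    """With structured data, finding down interfaces is a simple walk."""
    data = json.loads(payload)
    return [i["name"] for i in data["interfaces"] if i["oper_status"] != "up"]


print(is_text_in_envelope(TEXT_IN_ENVELOPE))
print(down_interfaces(STRUCTURED))
```

The second response would still need a TextFSM template or a regular expression to be useful, which is exactly what the requirement is meant to rule out.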
Device Configuration in Structured Format
The device SHOULD return its configuration in structured format (JSON or XML) with meaningful structure (for example, ACL lines should be within the ACL).
I don’t know why I should write another configuration scraping program to figure out what BGP neighbors a device has if I could do the same thing with a simple walk down the returned object. I’ve had enough Perl regexps for one lifetime.
Pass: Junos, ALU/Nokia, Cisco IOS XE release 16.
Mostly there: Cisco IOS and IOS-XE (prior to release 16).
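Here is what the "simple walk down the returned object" could look like. The configuration structure below is invented for illustration (real devices use their own or standard data models), but it shows the point of meaningful structure: BGP neighbors sit under the BGP process, and ACL lines sit within the ACL.

```python
# Hypothetical configuration object, as it might look after json.loads().
# The key names are assumptions, not any vendor's data model.
CONFIG = {
    "router": {
        "bgp": {
            "local_as": 65000,
            "neighbors": [
                {"address": "192.0.2.1", "remote_as": 65001},
                {"address": "192.0.2.5", "remote_as": 65002},
            ],
        }
    },
    "access_lists": {
        "EDGE-IN": {"lines": ["permit tcp any any eq 179", "deny ip any any"]}
    },
}


def bgp_neighbors(config: dict) -> list[tuple[str, int]]:
    """Return (address, remote AS) pairs; empty list if BGP is not configured."""
    neighbors = config.get("router", {}).get("bgp", {}).get("neighbors", [])
    return [(n["address"], n["remote_as"]) for n in neighbors]


def acl_lines(config: dict, name: str) -> list[str]:
    """Return the lines of one ACL; they live inside the ACL object."""
    return config.get("access_lists", {}).get(name, {}).get("lines", [])


print(bgp_neighbors(CONFIG))
print(acl_lines(CONFIG, "EDGE-IN"))
```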
Atomic configuration changes
Changes to device configuration MUST be atomic, more so if the device supports NETCONF – either all the submitted changes are accepted or none is.
I really don’t care if I can get that done in a NETCONF session with commit capability or as a single huge REST call, but I don’t want to be cut off the box once again because the box accepted only half the ACL.
Pass: Junos, IOS XR, Arista EOS
Almost: Cisco IOS XE. The REST interface is atomic within a single call, as is the NETCONF implementation in release 16.x, which implements rollback-on-error.
Fail: Cisco IOS, Nexus OS
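For NETCONF, one way to ask for all-or-nothing behavior is the `rollback-on-error` error option of `<edit-config>` (RFC 6241), which a device supports only if it advertises the matching capability. The sketch below builds such a request with the standard library; the helper name and the `urn:example:acl` payload namespace are assumptions made up for the example.

```python
import xml.etree.ElementTree as ET

# Base NETCONF namespace defined in RFC 6241.
NETCONF_NS = "urn:ietf:params:xml:ns:netconf:base:1.0"


def nc(tag: str) -> str:
    """Qualify a tag name with the base NETCONF namespace."""
    return f"{{{NETCONF_NS}}}{tag}"


def build_atomic_edit(message_id: str, payload: ET.Element,
                      target: str = "running") -> str:
    """Return an <edit-config> RPC that asks the device to undo everything
    if any part of the change fails (hypothetical helper)."""
    rpc = ET.Element(nc("rpc"), {"message-id": message_id})
    edit = ET.SubElement(rpc, nc("edit-config"))
    ET.SubElement(ET.SubElement(edit, nc("target")), nc(target))
    # error-option must precede <config> in the RPC.
    ET.SubElement(edit, nc("error-option")).text = "rollback-on-error"
    ET.SubElement(edit, nc("config")).append(payload)
    return ET.tostring(rpc, encoding="unicode")


# Invented payload: a two-line ACL in a made-up namespace.
acl = ET.Element("{urn:example:acl}acl", {"name": "EDGE-IN"})
for line in ("permit tcp any any eq 179", "deny ip any any"):
    ET.SubElement(acl, "{urn:example:acl}line").text = line

print(build_atomic_edit("102", acl))
```

On devices with a candidate datastore, the alternative is to edit the candidate and issue a single `<commit>`; either way the box should never end up with half the ACL.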
Configuration Rollback
The device MUST support rollback to a previous configuration.
If I made a mistake, I want to be able to go back to a previous configuration without spending hours hand-crafting the differences between the mess I made and the configuration that worked before I started messing it up.
Pass: Junos, IOS XR, Arista EOS, Cisco IOS, Nexus OS, ALU/Nokia
Configuration Replace
The device MUST support replacing current configuration with a new configuration without a reload.
Sometimes I really don’t want to waste my time calculating the differences that have to be made to get the device to do what I want, particularly when I create the whole configuration with a template.
Pass: Junos, IOS XR, Arista EOS, Cisco IOS/XE, Nexus OS
Configuration diff
The device SHOULD be able to create a list of configuration commands needed to transform one configuration into another.
It’s great if you can point out the differences between two configurations to the engineer who has to approve the change. Oh, and I’m looking for the list of commands to get from A to B. I can run a diff on Linux myself.
Pass: Junos, Cisco IOS
Fail: Most everyone else. Many platforms use standard Linux diff instead of considering configuration context.
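To illustrate what "considering configuration context" means, here is a deliberately simplified sketch that turns two indented, IOS-style configurations into a list of commands. It is my own toy algorithm, not how any vendor implements it: it handles only one level of nesting, ignores line order (so it is not safe for ACLs), and assumes that prefixing a command with `no` removes it.

```python
def parse_config(text: str) -> dict[str, list[str]]:
    """Map each top-level line to the indented lines that follow it."""
    sections: dict[str, list[str]] = {}
    current = None
    for raw in text.splitlines():
        if not raw.strip() or raw.strip() == "!":
            continue  # skip blank lines and "!" separators
        if raw[0].isspace():
            if current is None:
                raise ValueError(f"indented line without a parent: {raw!r}")
            sections[current].append(raw.strip())
        else:
            current = raw.strip()
            sections.setdefault(current, [])
    return sections


def negate(line: str) -> str:
    """Assumption: 'no X' removes X, and removing 'no X' means entering X."""
    return line[3:] if line.startswith("no ") else f"no {line}"


def config_diff(old: str, new: str) -> list[str]:
    """Return commands that transform configuration `old` into `new`."""
    before, after = parse_config(old), parse_config(new)
    commands = [negate(parent) for parent in before if parent not in after]
    for parent, children in after.items():
        old_children = before.get(parent)
        if old_children is None:
            # Whole section is new: enter it with all its lines.
            commands.append(parent)
            commands.extend(f" {c}" for c in children)
            continue
        removed = [c for c in old_children if c not in children]
        added = [c for c in children if c not in old_children]
        if removed or added:
            # Re-enter the parent so the changes land in the right context.
            commands.append(parent)
            commands.extend(f" {negate(c)}" for c in removed)
            commands.extend(f" {c}" for c in added)
    return commands


OLD = """
hostname r1
interface Ethernet1
 description uplink
 ip address 192.0.2.1 255.255.255.0
interface Ethernet2
 shutdown
"""
NEW = """
hostname r1
interface Ethernet1
 description core uplink
 ip address 192.0.2.1 255.255.255.0
interface Ethernet3
 no shutdown
"""
for command in config_diff(OLD, NEW):
    print(command)
```

A plain Linux diff of the same two files would show the changed description line without the `interface Ethernet1` line it belongs to, which is why its output cannot be pasted into a device.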
Support for Industry-Standard Models
The device SHOULD support industry-standard configuration data models (IETF and/or OpenConfig).
We waited long enough to get them. I don’t want to wait another decade for the vendors to implement them.
Pass: Junos, Arista EOS (OpenConfig), Nexus OS (OpenConfig), IOS XE (IETF), IOS XR (OpenConfig)
Warning: While most vendors support some industry standard, always check out what can be configured through the standard models.
Feature Parity
Paraphrasing Ron Broersma: All functionality requested in the RFP must be fully supported by the device API and meet the above requirements.
Anything Else?
I probably forgot a few critical requirements. Please list them in the comments.
Want to Know More?
Check out the network automation webinars and the Building Network Automation Solutions online course.
Revision History
- 2023-03-12
- Removed all references to Brocade VDX which has been obsolete for years.
A.) Support "CM" tools that were created for server systems and their state, not configuration
B.) Use these tools to provide "CM" for configuration - which is what you are mostly concerned with on network devices.
C.) Idempotency for configuration, without context, is much harder than idempotency for state - i.e. is service X running, is pkg Y installed, etc.
D.) As Leke mentioned, scaling the disposition and logistics of these agents is very tough when you get into the thousands of devices.
All that being said, there are several vendors that support both Puppet and Chef in one form or another. Junos supports them on all but the SRX and PTX platforms as of their latest release.
Both Ansible and SaltStack are much better ways of attacking network state/configuration management than Puppet and Chef. Ansible doesn't require agents at all, and SaltStack has a great proxy system whose proxy agents need very little configuration and work against any device you want, regardless of vendor. You just have to have the modules, the same way you do for Ansible anyway.
This is a case of missing the forest for the trees. You are narrowing down your focus for automation based on the set criteria of a product. And, that product wasn't even meant to do what you want in the first place. To really get the most out of automation, what Ivan has put together is a base set of must haves or don't even attempt it. If you are not looking at the capabilities of the vendor OS from the perspective of what you need to accomplish business (not technical) objectives via automation, then you are missing said forest.
* Seconding Donatas above - network OSs have to support standard CM clients, and not just in PowerPoint (even though Cumulus does work with the standard Chef client)
* Network OSs should have x86 virtual replicas of major hw products. It's about time network engineers had their own dev and staging environments, just like programmers
Unfortunately ENT campus/DC are dominated by other vendors
If the dominant vendor doesn't support the features you need, don't support them.
Also, since we are talking about DevOps: telemetry must be supported, with HTTP push and at least one asynchronous messaging protocol (AMQP, XMPP, MQTT, Kafka...)
* The device should lend itself to virtualization as deployed in production - so, if it is a firewall, it should be 'virtualizable' with all production features (multi-context, transparent firewall, etc.)
* The device should be created with an API-first approach (especially if it comes from a closed vendor). If there is a feature on the product, it should be accessible via an API
* If the enterprise intends to manage the device using a centralized controller (BigIQ, CSM, etc.), every feature on that management platform should be available as a north-bound API, consumable by automation tools
====
However, since then, I have been rethinking the API piece. Given that we look up to our *nix pioneers as standard bearers for system automation, why do we demand it? I am now more inclined to think that the API mandate should apply only if the vendor OS is a closed system. If an open-system vendor creates APIs for applications running on their system (say, for BGP configs), kudos to them, but I no longer think that should be mandated. Something like Ansible could be the 'API broker' that lets higher-level workflow tools interact with the services on that platform...
Thoughts?
Cisco does support retrieval of structured operational data on IOS-XR and Nexus platforms in recent releases. The operational data can be streamed out from the router and received by a client with a push model, rather than the pull model normally used with SNMP. The telemetry stream can be formatted in JSON, Google Protocol Buffers or Google KeyValue Protocol Buffer formats. Streaming telemetry is supported using the non-proprietary OpenConfig telemetry model for subscribing to the operational data that the user is interested in. Most of the OpenConfig models are supported, and Cisco native models cover the areas where OpenConfig models either don't exist or are still being worked out. The subscriptions themselves can be made over a NETCONF/XML or Google RPC session to the router.
Cisco also supports structured configuration data by way of IETF/OpenConfig/Cisco native YANG models over a NETCONF session.
Cisco also supports the Google RPC mechanism to push a config change structured as a JSON object to the router.
Cisco has also built and open-sourced a framework called YDK (YANG Development Kit) that allows a user to compile the YANG models into objects in a language like Python (other language bindings are being worked on). The user is then able to manipulate the config on the router by programmatically setting attributes on the config objects and performing a CRUD operation to write the data to the router and have the config take effect.
Thanks for the marketing manifesto ;) If you had shared your contact details or contacted me offline, we could have added IOS XR to the lists. Alas...
Now for the details:
"Cisco is big time into Model Driven Manageability" << what counts for me is what's shipping and documented. Big-time statements and visions are nice, executing on them is even better.
"Retrieval of Structured Operational data on IOS-XR and Nexus" << Nexus OS is in the list. See above for IOS-XR.
Streaming telemetry - interesting, but not the topic of this blog post.
Open Config and IETF models - mentioned.
Structured configuration data - Cisco has at least four different network operating systems, so please specify which one(s) support it. The last time I checked, Nexus OS didn't even have the "get-config" NETCONF command. I know that has been added, but I haven't tested what it returns yet. Checking how XML configuration looks in the latest versions of IOS XE is already on my to-do list.
Looks like you are a paid shill for Brocade based on the quote earlier in your blog "The Pass/Fail information included below was collected to the best of my knowledge with extensive help from Jason Edelman, Nick Buraglio, David Barroso and several Brocade engineers (THANK YOU!)."
This is the last post from me.
This says a lot about architecture and standards at vendors... if you can't get it right within one vendor, how are you going to adapt to market standards?
DISCLAIMER: I do not represent or work for any vendor of the equipment. This is just my experience. If any of my thoughts are wrong, maybe I used the wrong tool, or I did not have enough information about the vendor devices because it is not publicly available.
Having NETCONF is good, but devices must also publish all their configuration modules as capabilities in the NETCONF hello messages. I have tried IOS-XR and JunOS: this is not true for JunOS, for example, but it works for IOS-XR.
The next key thing is the ability to get the YANG modules for the declared capabilities out of the device (ietf-netconf-monitoring, RFC 6022). It works well for both JunOS and IOS-XR.
And last but not least: the obtained YANG modules must compile ;-), for example with the publicly available pyang, or better still with "pyang --ietf". This is not true for either of them: not for several IOS-XR modules, and not for the JunOS YANG models retrieved from the device.
So the industry is heading toward a bright future, and this is good for all of us. But clearing away some of the marketing hype is sometimes also very important.
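The checks described in the last comment can be scripted. The sketch below, using only the Python standard library, extracts the YANG modules a device advertises in its NETCONF hello message and builds the RFC 6022 `<get-schema>` request for one of them. The helper names and the sample hello are invented for illustration; sending the RPC and running pyang on the result are left out.

```python
import xml.etree.ElementTree as ET
from urllib.parse import parse_qs

NETCONF_NS = "urn:ietf:params:xml:ns:netconf:base:1.0"
# Namespace of the ietf-netconf-monitoring module (RFC 6022).
MONITORING_NS = "urn:ietf:params:xml:ns:yang:ietf-netconf-monitoring"


def advertised_modules(hello_xml: str) -> dict[str, str]:
    """Map module name to revision for every capability URI that names one."""
    modules = {}
    for cap in ET.fromstring(hello_xml).iter(f"{{{NETCONF_NS}}}capability"):
        # Module capabilities carry "?module=...&revision=..." parameters.
        _, _, query = (cap.text or "").strip().partition("?")
        params = parse_qs(query)
        if "module" in params:
            modules[params["module"][0]] = params.get("revision", [""])[0]
    return modules


def build_get_schema(message_id: str, module: str, revision: str = "") -> str:
    """Return a <get-schema> RPC asking for one module in YANG format."""
    rpc = ET.Element(f"{{{NETCONF_NS}}}rpc", {"message-id": message_id})
    get_schema = ET.SubElement(rpc, f"{{{MONITORING_NS}}}get-schema")
    ET.SubElement(get_schema, f"{{{MONITORING_NS}}}identifier").text = module
    if revision:
        ET.SubElement(get_schema, f"{{{MONITORING_NS}}}version").text = revision
    ET.SubElement(get_schema, f"{{{MONITORING_NS}}}format").text = "yang"
    return ET.tostring(rpc, encoding="unicode")


# Invented sample hello message, not captured from a real device.
HELLO = f"""<hello xmlns="{NETCONF_NS}"><capabilities>
<capability>urn:ietf:params:netconf:base:1.1</capability>
<capability>urn:ietf:params:xml:ns:yang:ietf-interfaces?module=ietf-interfaces&amp;revision=2014-05-08</capability>
</capabilities></hello>"""

for name, revision in advertised_modules(HELLO).items():
    print(build_get_schema("103", name, revision))
```

Each module text returned by the device could then be saved to a file and checked with `pyang --ietf`, as the comment suggests.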