Implementing 'Undo' Functionality in Network Automation
Kurt Wauters sent me an interesting challenge: how do we do rollbacks based on customer requests? Here’s a typical scenario:
You might have deployed a change that works perfectly fine from a network perspective but broke a customer application (for example, due to undocumented usage), so you must be able to return to the previous state even if everything works. Everybody says you need to “roll forward” (improve your change so it works), but you don’t always have that luxury and might need to take a step back. So, change tracking is essential.
He’s right: the undo functionality we take for granted in consumer software (for example, Microsoft Word) has totally spoiled us.
Unfortunately, there’s no silver bullet unless you describe your network in YAML files, store them in a Git repository, and use git revert to undo changes. The moment you want to have a fancier data store, be it a relational database, key-value store, or a NoSQL database, you have to implement the change tracking in whatever application is manipulating the data [1].
Network automation is not the only domain with that particular problem [2]. Have you ever tried to undo a grocery store sale or unorder a pizza? How about undoing a bank transfer? [3] In all those cases, the changes must be undone manually (or by the application) and logged as yet another transaction.
We have to use the same approach in any system that stores data in something more structured than text files under version control:
- Keep a separate log of changes made during user transactions [4].
- Have a way to request undoing a transaction. This process should include sanity checks so you don’t mess up the intervening changes and should reject the undo operation if those checks fail.
- Treat every undo transaction as a regular transaction that triggers deployment of changed network configuration.
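To make the three steps above a bit more tangible, here is a minimal Python sketch of an application-level change log with a guarded undo. Everything in it is an assumption made for illustration: the in-memory dictionary stands in for whatever data store you use, and the names DataStore, Transaction, Change, UndoRejected, and the deploy callback are hypothetical, not part of any real tool. The sanity check is deliberately crude: the undo is rejected if any value the transaction wrote has been changed since.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any, Callable, Optional

_MISSING = object()  # sentinel meaning "this key did not exist"


@dataclass
class Change:
    key: str
    before: Any  # value before the transaction (_MISSING if the key was absent)
    after: Any   # value after the transaction (_MISSING if the key was deleted)


@dataclass
class Transaction:
    tx_id: int
    description: str
    changes: list[Change] = field(default_factory=list)


class UndoRejected(Exception):
    """Raised when a later transaction touched the same data."""


class DataStore:
    """Hypothetical data store that logs every change it makes."""

    def __init__(self, deploy: Optional[Callable[[dict], None]] = None):
        self.data: dict[str, Any] = {}
        self.log: list[Transaction] = []
        # 'deploy' is an assumed hook that pushes the new state to the network
        self._deploy = deploy or (lambda data: None)

    def commit(self, description: str, updates: dict[str, Any]) -> Transaction:
        """Apply updates, record before/after values, trigger deployment."""
        tx = Transaction(len(self.log) + 1, description)
        for key, new in updates.items():
            old = self.data.get(key, _MISSING)
            tx.changes.append(Change(key, old, new))
            if new is _MISSING:
                self.data.pop(key, None)
            else:
                self.data[key] = new
        self.log.append(tx)
        self._deploy(self.data)
        return tx

    def undo(self, tx_id: int) -> Transaction:
        """Undo a logged transaction by committing its reverse."""
        if not 1 <= tx_id <= len(self.log):
            raise KeyError(f"unknown transaction {tx_id}")
        tx = self.log[tx_id - 1]
        # Sanity check: refuse if anything the transaction wrote has changed
        for change in tx.changes:
            if self.data.get(change.key, _MISSING) != change.after:
                raise UndoRejected(
                    f"{change.key!r} was modified after transaction {tx_id}"
                )
        reverse = {c.key: c.before for c in tx.changes}
        # The undo is just another transaction: logged and deployed
        return self.commit(f"undo of transaction {tx_id}", reverse)


if __name__ == "__main__":
    store = DataStore(deploy=lambda data: print("deploying", data))
    t1 = store.commit("add VLAN 100", {"vlan100": {"name": "customer-a"}})
    t2 = store.commit("rename VLAN 100", {"vlan100": {"name": "customer-b"}})
    try:
        store.undo(t1.tx_id)  # rejected: t2 changed the same key
    except UndoRejected as err:
        print("undo rejected:", err)
    store.undo(t2.tx_id)  # accepted: nothing touched vlan100 since t2
```

A real implementation would need a smarter conflict check (for example, per-attribute instead of per-key), persistent storage for the log, and copies of mutable values, which is exactly where the fun starts.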
Considering all that, maybe it’s worth staying with YAML files and Git for a while longer ;)
[1] In theory, you could get the best of Git and relational databases with Dolt. Still, it’s probably great fun trying to undo a transaction that touched data that was subsequently partially modified by other transactions.
[2] IPAM tools are no better; the only exception might be Nautobot with the Dolt plugin.
[3] Having sender and recipient accounts in different banks makes it even more fun.
[4] Most relational databases have no built-in change logging. You could get fancy and use triggers to log changes, but it’s messy and platform-dependent. It’s much better to solve the problem in the application.
I wonder if software solutions provided by vendors will be able to resolve some of the issues mentioned. For example, Juniper's Paragon offers a complete view of the changes applied to each device as part of a deployment, and those changes can be undone in a single shot by simply (for the lack of a better word) deleting the deployment. I vaguely remember Cisco NSO supporting rollback too, albeit on a device-by-device basis. Not sure if there is a vendor-neutral solution out there.
No. Configuration change management is (relatively) trivial and can be done with a vendor tool, an open-source tool like Oxidized, or a home-brewed solution using something to grab config files and Git.
NSO supports the rollback of failed multi-device transactions (because it cannot reach a device or because a device refuses to implement the change). I'm not sure it can do a true "undo".
And of course there's a vendor-neutral solution -- YAML / Git / Jinja2 / Ansible ;)