[FATAL] Ansible Release 12.0 Breaks netlab Jinja2 Templates

On September 9th, Ansible release 12.0 appeared on PyPI. It requires ansible-core release 2.19, which includes breaking changes to Jinja2 templating. netlab Jinja2 templates rely on a few Ansible Jinja2 filters, so netlab imports and uses those filters. It looks like those imports pulled in the breaking changes, which then broke the netlab containerlab configuration file template (details).

netlab did not check the Ansible core version (we never had a similar problem in the past), and the installation scripts did not pin the Ansible version (feel free to blame me for this one), which means that any new netlab installation created after September 9th crashed miserably on the simplest lab topologies.

This is the workaround we implemented in netlab release 25.09-post1 (released earlier today):

  • The netlab command checks the Ansible core version and refuses to run with Ansible core 2.19 or greater.
  • The only exception to the above rule is the netlab install command, which is the recommended mechanism for downgrading Ansible in simple installations.
  • The installation script used by the netlab install ansible command pins the Ansible release to 11.10 or lower.
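
To illustrate the first bullet, here is a minimal sketch of such a version gate. It is not the actual netlab code: the function names and the FIRST_UNSUPPORTED constant are made up for this example, and the only outside assumption is that the package is installed under the distribution name ansible-core.

```python
import re
from importlib import metadata

# First ansible-core release with the breaking templating changes (per the text above)
FIRST_UNSUPPORTED = (2, 19)

def parse_major_minor(version):
    """Extract (major, minor) from a version string such as '2.19.0' or '2.19.0rc1'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"cannot parse version: {version!r}")
    return int(match.group(1)), int(match.group(2))

def ansible_core_supported(version):
    """Return True when the given ansible-core version predates 2.19."""
    return parse_major_minor(version) < FIRST_UNSUPPORTED

def installed_ansible_core_version():
    """Return the installed ansible-core version string, or None if it is missing."""
    try:
        return metadata.version("ansible-core")
    except metadata.PackageNotFoundError:
        return None

# Hypothetical usage: a real tool would exit here unless running its install command
version = installed_ansible_core_version()
if version is None:
    print("ansible-core is not installed")
elif ansible_core_supported(version):
    print(f"ansible-core {version} is fine")
else:
    print(f"ansible-core {version} is too new, downgrade Ansible first")
```

Comparing integer tuples rather than strings matters here: as strings, "2.9" sorts after "2.19", so a naive string comparison would misjudge older releases.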

I also added more information to the netlab version command to simplify troubleshooting of similar issues.

Potential Root Cause

I suspect this change, described in the ansible-core 2.19 release notes, is the root cause of the crashes we’re experiencing:

templating - Access to _ prefixed attributes and methods, and methods with known side effects, is no longer permitted. In cases where a matching mapping key is present, the associated value will be returned instead of an error. This increases template environment isolation and ensures more consistent behavior between the . and [] operators.

netlab heavily uses _ prefixed attributes for internal data that is not checked against the lab topology schema. The netlab data transformation code computes some of those attributes, which are later used in the device configuration templates. Sometimes these attributes are not defined, so we use the |default() filter on them, and that seems to trigger Jinja2 templating errors. Every use of a _ prefixed attribute with Ansible release 12.0 is thus a ticking bomb.

What’s Next?

Here are my early ideas on what to do next (they will probably change as we discuss them):

  • Keep the netlab-installed Ansible version pinned to release 11.10 (or lower) for the foreseeable future.
  • Remove the dependency on Ansible filters used in netlab templates (primarily the ipaddr filter), either using another library like j2ipaddr or writing our own filters.

Eventually, we’ll have to bite the bullet and figure out how to handle device configuration templates. We could:

  • Create device configurations with netlab and use Ansible solely to push them to network devices, or
  • Thoroughly check all our Ansible playbooks and Ansible-rendered device configuration templates.
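
The second option could start with a rough inventory of where templates touch _ prefixed attributes. The sketch below is only an illustration of that idea, not an existing netlab tool: the function name, the regular expressions, and the _parent_intf attribute in the sample are hypothetical. It only catches dot-style access inside Jinja2 blocks, and it can produce false positives (for example, a dot followed by an underscore inside a quoted string).

```python
import re

# Jinja2 expression ({{ ... }}) and statement ({% ... %}) blocks
JINJA_BLOCK = re.compile(r"\{\{.*?\}\}|\{%.*?%\}", re.DOTALL)
# Dot access to a name starting with an underscore, e.g. intf._parent_intf
UNDERSCORE_ATTR = re.compile(r"\.\s*(_[A-Za-z0-9_]*)")

def find_underscore_attrs(template_text):
    """Return the sorted, unique _ prefixed attribute names used inside Jinja2 blocks."""
    found = set()
    for block in JINJA_BLOCK.finditer(template_text):
        found.update(UNDERSCORE_ATTR.findall(block.group(0)))
    return sorted(found)

# Hypothetical template fragment; the attribute names are invented for the example
sample = """
interface {{ intf.ifname }}
{% if intf._parent_intf|default('') %}
 description member of {{ intf._parent_intf }}
{% endif %}
 mtu {{ intf.mtu|default(1500) }}
"""

print(find_underscore_attrs(sample))  # prints ['_parent_intf']
```

Running something like this over every template file would at least show how many places need a closer look before deciding between the two options.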

Finally, I don’t blame the Ansible team for anything that happened. They do what they feel is the right thing for their project. I just hate that there’s no feature flag we could set to disable a breaking change.

Upgrading or Starting from Scratch?
