The ERM syntax is a bit baroque (and not well documented), so let's work through an example. This is the configuration you need to detect high overall CPU utilization on the main CPU in the box:
resource policy
 policy HighGlobalCPU global
  system
   cpu total
    critical rising 95 falling 70 interval 10
    major rising 75 falling 50 interval 10
 user global HighGlobalCPU
And here are the usage/configuration guidelines:
- The whole ERM subsystem is configured under the resource policy section;
- You always have to configure a policy and a user to which the policy applies. In our example, the user is global (as we're measuring the global CPU load);
- The policy we're defining must have the global keyword to indicate we're measuring overall utilization (otherwise you can't attach it to the global user);
- We're measuring the load on the main CPU, so we're configuring the system subsection of the policy (on distributed platforms you could specify a slot name instead to measure utilization on a specific linecard);
- The cpu section selects CPU load measurements. You can measure interrupt load, process load or total CPU load;
- Within each resource section in the policy (in our example, total CPU load on the main system) you can define minor, major and critical thresholds (syslog messages are generated when each threshold is crossed).
- After the policy is defined, it's applied to the global user.
With the CPU load measurement policy defined, the router will generate syslog messages (SYS-4-CPURESRISING) every time the overall CPU load exceeds the specified rising thresholds. When the utilization falls below the falling threshold, the SYS-4-CPURESFALLING syslog message is generated.
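The rising/falling pairs behave like a hysteresis band: once a threshold has fired, it stays active until the load drops below its falling value. Here is a rough Python model of that behavior, using the thresholds from the example policy. It is only a sketch, not a description of how IOS implements ERM: the `Threshold` class, the `evaluate` function and the sample values are made up for illustration, and the `interval` parameter is ignored entirely.

```python
from dataclasses import dataclass


@dataclass
class Threshold:
    level: str            # "critical" or "major", as in the example policy
    rising: int           # percent; exceeding it raises the alarm
    falling: int          # percent; dropping below it clears the alarm
    active: bool = False  # True between a rising and a falling event


def evaluate(samples, thresholds):
    """Return (sample_index, level, event) tuples for every state change.

    Hypothetical helper: models only the hysteresis, not the ERM
    'interval' timing.
    """
    events = []
    for i, load in enumerate(samples):
        for t in thresholds:
            if not t.active and load > t.rising:
                t.active = True
                events.append((i, t.level, "rising"))
            elif t.active and load < t.falling:
                t.active = False
                events.append((i, t.level, "falling"))
    return events


# Thresholds copied from the HighGlobalCPU policy; the load samples are invented.
thresholds = [Threshold("critical", 95, 70), Threshold("major", 75, 50)]
samples = [40, 80, 97, 85, 72, 60, 45]
events = evaluate(samples, thresholds)

for index, level, event in events:
    print(f"sample {index}: {level} {event}")
```

Note that the load of 72 (between the critical rising and falling values) generates nothing in this model. The critical alarm clears only when the load drops to 60, and the major alarm clears only at 45.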