CPU Utilization false alerts

CMK version: 2.0.0p5.cfe
OS version: Ubuntu 20.04.2 LTS

I am using the “CPU utilization for simple devices” service monitoring rule to monitor CPU utilization on a vCenter Server appliance. The rule is set to the default thresholds of 80% Warning and 90% Critical.

For some reason this alert triggers a warning on many nights at 12:00 AM, during a scheduled backup of the appliance, even though both the graph in the alert email and the vCenter VAMI interface clearly show that utilization never rises above 60%.

So I don’t understand why the alert is being triggered.

Hi @jscovill,
The Summary shows 83%. That this is missing from the graph might just be a timing issue: the alert is created at 00:00:42 and the graph is pulled for it at that moment, but the 83% measurement is only saved to the graph maybe 1-2 seconds later.
If you check the graph now, I’m assuming it will show an 83% peak on Thu Apr 7 at 00:00/00:01.

But the utilization never reached that according to either the appliance VAMI or the vCenter performance stats. Nor did it exceed 60% in the graph for the service:

Your graph is showing Apr 13 and 14, not the 7th :slight_smile:

There are several options to avoid unnecessary notifications.

  • The check “CPU utilization for simple devices” offers several tuning knobs: not only the thresholds, but also the time ranges in which a value exceeds the threshold can be tuned.
  • You should always use the rulesets “Maximum number of check attempts for hosts/services” to define the number of checks until the hard state is reached.
  • You can consider delaying notifications for specific services if they are still too noisy.
  • You can use time ranges for notifications, so that no notification is generated during the daily backup.
  • As a last option, you can turn off notifications for specific services completely.
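
To illustrate why the “maximum number of check attempts” setting helps with a short spike like the 83% reading discussed above, here is a minimal Python sketch. It is a simplified model and not Checkmk's actual implementation: the function names, the sample values, and the at-or-above comparison are assumptions made for illustration. Only the 80%/90% thresholds come from this thread.

```python
from typing import List

# Thresholds taken from the rule described in this thread (percent).
WARN = 80.0
CRIT = 90.0


def state_for(util: float) -> str:
    """Map one CPU utilization sample to a state (assumed at-or-above comparison)."""
    if util >= CRIT:
        return "CRIT"
    if util >= WARN:
        return "WARN"
    return "OK"


def notified_samples(samples: List[float], max_attempts: int) -> List[int]:
    """Return the indices at which a problem would become 'hard' and notify.

    Hypothetical model: a problem only notifies once it has been seen in
    max_attempts consecutive checks; any OK result resets the counter.
    """
    notified = []
    consecutive = 0
    for index, util in enumerate(samples):
        if state_for(util) == "OK":
            consecutive = 0
            continue
        consecutive += 1
        if consecutive == max_attempts:
            notified.append(index)
    return notified


# Made-up samples: one short spike during a backup window.
spike = [55.0, 58.0, 83.0, 57.0, 54.0]
# Made-up samples: a sustained high load.
sustained = [55.0, 85.0, 86.0, 87.0, 88.0]

print(notified_samples(spike, max_attempts=1))      # [2]
print(notified_samples(spike, max_attempts=3))      # []
print(notified_samples(sustained, max_attempts=3))  # [3]
```

With a single attempt, the one 83% sample notifies immediately. With three attempts, the same spike is ignored, while a load that stays above 80% for three checks in a row still notifies.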