Not alerting for reboots

mgillespie1981 · April 15, 2024, 1:18pm

We have a request to monitor a number of hosts that will reboot randomly during the day or at night and we should NOT alert on this.

What would be the best way to do this?
I know we could maybe negate notifications for these hosts but then that would stop all notifications.
We will want to alert on the other services.

aeckstein · April 15, 2024, 1:43pm

Hi Marc,

you could write a PowerShell or Bash script, triggered by the reboot in the underlying OS, that sets a downtime in Checkmk through the REST API.
Or, if the reboot is orchestrated by another tool, that tool could also set downtimes in Checkmk via scripts or REST calls.
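
As a rough sketch of the REST approach, the snippet below builds a request that schedules a short host downtime. The endpoint path, the `Bearer` authentication header, and the payload fields follow the Checkmk REST API v1 as far as I know it, so please verify them against the API documentation of your own site. The server URL, site name, automation user, secret, and host name are placeholders, and `build_downtime_request` is just a hypothetical helper name.

```python
import json
import urllib.request
from datetime import datetime, timedelta, timezone


def build_downtime_request(base_url, user, secret, host_name, minutes=15):
    """Build (but do not send) a request that schedules a host downtime.

    base_url is assumed to look like
    https://<server>/<site>/check_mk/api/1.0 (placeholder values).
    """
    start = datetime.now(timezone.utc)
    end = start + timedelta(minutes=minutes)
    payload = {
        "downtime_type": "host",
        "host_name": host_name,
        "start_time": start.isoformat(timespec="seconds"),
        "end_time": end.isoformat(timespec="seconds"),
        "comment": "Automatic downtime for reboot",
    }
    return urllib.request.Request(
        url=f"{base_url}/domain-types/downtime/collections/host",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {user} {secret}",
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_downtime_request(
        "https://monitoring.example.com/mysite/check_mk/api/1.0",  # placeholder
        "automation",      # placeholder automation user
        "changeme",        # placeholder secret
        "workstation01",   # placeholder host name
    )
    # urllib.request.urlopen(req)  # uncomment to actually send the request
    print(req.full_url)
```

A shutdown hook on the host (or the orchestration tool) would call something like this just before the reboot starts.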

If that is too complex and there is no other indicator of when a downtime should be set, you could just raise the number of check attempts for the host checks.
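
To get a feeling for how many attempts are needed, here is a back-of-the-envelope calculation. It assumes a simplified model: the first failed check counts as attempt 1, every further attempt follows one retry interval later, and the host only counts as down once all attempts have failed. The actual timing depends on the monitoring core and on how the host checks are configured, so treat the result as a starting point. `attempts_to_cover` is a hypothetical helper name.

```python
import math


def attempts_to_cover(reboot_seconds, retry_interval_seconds):
    """Smallest attempt count whose retries span the whole reboot.

    Simplified model: attempt 1 fails at the start of the reboot and
    each later attempt follows one retry interval after the previous one.
    """
    if retry_interval_seconds <= 0:
        raise ValueError("retry interval must be positive")
    retries = math.ceil(reboot_seconds / retry_interval_seconds)
    return retries + 1


if __name__ == "__main__":
    # A reboot of up to 5 minutes with a 60-second retry interval
    print(attempts_to_cover(300, 60))  # prints 6
```

The trade-off is that a host that is really down is also reported later by the same amount of time.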

mschlenker · April 15, 2024, 1:45pm

Might be even easier:

mgillespie1981 · April 15, 2024, 1:47pm

Yes, but they can be rebooted at random times, so scheduled downtimes wouldn’t work then?

mschlenker · April 15, 2024, 1:58pm

We know of Checkmk users that schedule a 10-minute scheduled downtime from 7:00 to 20:00 to allow their admins to perform any short maintenance during work hours. You might extend this to a 10-minute scheduled downtime from 0:00 to 23:59; you would then only get notified if the downtime covers midnight.

Edit: This will only work out of the box for a single reboot per day.
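
To illustrate the single-reboot limitation, here is a toy model, not Checkmk code. It assumes that the 10-minute downtime starts with the first problem inside the daily window and is then used up for that day. Reboot times are given as minutes since midnight; `uncovered_reboots` is a hypothetical name, and the midnight case mentioned above is not modelled.

```python
def uncovered_reboots(reboot_minutes, window=(0, 23 * 60 + 59), duration=10):
    """Return the reboot start times that would still cause a notification."""
    downtime_start = None
    uncovered = []
    for t in sorted(reboot_minutes):
        in_window = window[0] <= t <= window[1]
        if in_window and downtime_start is None:
            # The first problem inside the window starts the downtime
            downtime_start = t
            continue
        if downtime_start is not None and downtime_start <= t < downtime_start + duration:
            # Still inside the running 10-minute downtime
            continue
        uncovered.append(t)
    return uncovered


if __name__ == "__main__":
    print(uncovered_reboots([600]))       # one reboot at 10:00
    print(uncovered_reboots([600, 900]))  # second reboot at 15:00 is not covered
```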

mgillespie1981 · April 15, 2024, 2:55pm

These are personal boxes so I cannot predict when the downtime will occur. We want to essentially NOT alert for downtime but for everything else.

So from what you’ve said above, this could be done by setting a schedule from 0:00 to 23:59 but with a 24-hour scheduled downtime instead of 10 minutes?

mgillespie1981 · April 15, 2024, 3:00pm

A custom range like this? And then scheduled downtime on host?
They still want to alert on items such as CPU, disk space etc…

jsmyth · April 15, 2024, 3:27pm

Hi Marc,

I think there are a few options here, depending on the desired behaviour and the specifics of your situation.

The simplest solution is probably to disable the host-level notifications for these hosts, but that assumes that you do not want to alert on host availability at all. Note that host notifications and service notifications are different, so you should be able to disable host notifications while retaining alerts for all of the host’s services.

If the requirement is to ignore downtime only if it is temporary because of a(n) (un)planned reboot, I think Andre’s suggestion of increasing the number of check attempts before Checkmk flags these hosts as down is probably the way to go.

Alternatively, if you can be sure that there is no more than one reboot per day for any one of these hosts, Mattias’ suggestion of recurring scheduled downtimes may work.

Hope this helps,
Jason

mgillespie1981 · April 15, 2024, 3:35pm

How can I disable the host-level notifications in that case? I think this might be the only solution here.

jsmyth · April 15, 2024, 3:44pm

There may be better options, but, if nothing else, you should be able to create a rule set to “Cancel previous notifications” (rather than the default “Create notification with the following parameters”), then, under Conditions, select Match host event type and match all host event types. Set additional conditions, as appropriate, to ensure you still get host event notifications for systems that require them.

mgillespie1981 · April 15, 2024, 3:48pm

Sorry, but how do I disable the host-level notifications in the first place?

rons4 · April 16, 2024, 6:51am

Besides what was mentioned already, some alerting tools (e.g. SIGNL4) offer delayed notifications. So, if there was a temporary issue, like a server reboot but the server is up again after a few minutes you will not receive the wake-up call. Also, filtering for certain alert types can be found here.

mgillespie1981 · April 16, 2024, 9:19am

How can I disable host level notifications please?

dineshinspace · April 16, 2024, 9:28am

Hi @mgillespie1981 ,

Please refer to the rule settings below for disabling host notifications.

Regards,
DD

mgillespie1981 · April 16, 2024, 9:41am

And to confirm, this means it ignores alerts for down hosts but would still monitor and alert on the services?

dineshinspace · April 16, 2024, 10:13am

This rule only disables host notifications. Monitoring will continue to work as before.

As per the rule description, service notifications continue to work as before.