Start Host Downtime After the Host is Down

A host has been taken down for maintenance, but you forgot to schedule downtime before the alert notification was posted. This is what I do to correct it:

  1. Fake a check result of Host Up. Assuming you have clearing event notifications configured, one will be posted immediately.
  2. Quickly, before the next host check, put the host into downtime. The next check will reassert the Host Down state, but without a notification.
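The two steps above can be sketched as a pair of Nagios-style external commands sent over Livestatus. This is only a sketch under assumptions: it presumes the monitoring core accepts `PROCESS_HOST_CHECK_RESULT` and `SCHEDULE_HOST_DOWNTIME` as external commands, and the helper names, the host name, and the socket path in the usage note are hypothetical and not taken from this thread.

```python
import socket


def fake_host_up(host, now, output="Manually set to UP before scheduling downtime"):
    """Build the command for step 1: a passive host check result of UP.

    Status code 0 means UP for host checks. `now` is a Unix timestamp.
    """
    return f"COMMAND [{now}] PROCESS_HOST_CHECK_RESULT;{host};0;{output}\n"


def schedule_host_downtime(host, now, duration_s, author, comment):
    """Build the command for step 2: a fixed downtime starting at `now`.

    Field order: host;start;end;fixed;trigger_id;duration;author;comment.
    The comment must not contain semicolons, since they separate fields.
    """
    start, end = now, now + duration_s
    return (
        f"COMMAND [{now}] SCHEDULE_HOST_DOWNTIME;"
        f"{host};{start};{end};1;0;{duration_s};{author};{comment}\n"
    )


def send_command(socket_path, command):
    """Send one command line to a Livestatus Unix socket (path is an assumption)."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(socket_path)
        sock.sendall(command.encode("utf-8"))
```

To use it, call `send_command` first with the result of `fake_host_up("web01", int(time.time()))` and then, right away, with the result of `schedule_host_downtime(...)`, so that both land before the next scheduled host check. The same two actions are also available in the GUI; the script only removes the race against the check interval.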

I am at a loss here: why would you not simply schedule a downtime once the host is down and you realize your error? This sounds like an unnecessarily complicated process.

P.S.: I moved the post to the right category.

Fair question, Robin, and obviously not applicable to all Check_MK administrators.

On our system, notifications are sent to a secondary alert management service that generates incident tickets and assigns them to the support team responsible for the host that is down. That service cancels the ticket if the host comes back up before corrective action is taken. All of this can, of course, be avoided by placing the host in downtime before it goes down but, all too often, downtime is forgotten until it is too late. In that case, posting a “Host Up” notification, followed immediately by downtime, prevents or cancels the incident.

Why not change that so that a “host down” followed by “downtime start” also cancels the incident creation? :)
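As a rough illustration of that suggestion, the ticketing side could treat a downtime start the same way it treats a recovery. The class below is purely hypothetical (nothing is known about the actual alert management service in this thread), and it assumes the monitoring system forwards notifications whose type is one of `PROBLEM`, `RECOVERY`, or `DOWNTIMESTART`.

```python
from dataclasses import dataclass, field

# Notification types that cancel a ticket nobody has acted on yet.
# Treating DOWNTIMESTART like RECOVERY is the change being suggested.
CANCELLING_EVENTS = {"RECOVERY", "DOWNTIMESTART"}


@dataclass
class IncidentQueue:
    """Hypothetical stand-in for the external alert management service."""

    pending: dict = field(default_factory=dict)

    def handle(self, host, notification_type):
        """Process one notification and report what happened to the ticket."""
        if notification_type == "PROBLEM":
            self.pending[host] = "open"
            return "ticket opened"
        if notification_type in CANCELLING_EVENTS and host in self.pending:
            del self.pending[host]
            return "ticket cancelled"
        return "ignored"
```

With this in place, a “host down” followed by a “downtime start” would close the pending ticket without the faked Host Up result, provided the notification rules actually deliver downtime start events to the service.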
