Host Checks - Time and Attempt Settings

Hello,
I set our CheckMK to a maximum number of check attempts for hosts of 5 and the normal check interval to 3 minutes, with a retry check interval of 1 minute.
But now when I restart a Windows server, CheckMK doesn't send me an email.
I want to get a mail when a server takes longer than normal to restart, maybe 2 minutes.

Thanks
Michael

Hi @Michael_BGL and welcome to the Forum!

Since you have set the “maximum number of check attempts” to 5 and
the “check interval” to 3, the hard state “DOWN” is most likely
not reached with a simple reboot.

If you want to be notified about this, you may need to lower those values.

Please see the official documentation for more details: Notifications - Repeated check attempts

HTH,
Thomas

Hello,
Thanks for your fast answer.
So that means CheckMK tries 5 times with a delay of 3 minutes, right? And the host only shows as critical when it has been down for more than 15 minutes?

Thanks

Hey Michael,

No, it will show you - in your web interface - as soon as it notices that the host is DOWN. But it will only notify you once a HARD state is reached, which - according to your configuration - will be the case after 15 minutes.

Thomas

Ah OK, the CheckMK agent sees the host DOWN state exactly when it happens, but the notification about it comes 15 minutes later.
How can I play with these two settings? Five tries with a shorter time between checks, or fewer checks with a longer time between them?

Hi Michael,

I believe I didn’t answer correctly… You have set the following parameters:

  • Host check interval: 3 minutes
  • Check attempts: 5
  • Retry check interval: 1 minute

A host check is thus executed every 3 minutes.
Let’s say it’s down on the first attempt:

  1. Retry interval is 1 min times 5 check attempts = 5 additional minutes
  2. During each retry, this state is soft
  3. The first notification should thus come after approx. 5 minutes.

And this is because - with this configuration - a hard state will be reached after
5 minutes.

If this is too long, i.e. “too late”, you need to modify those values to fit
your requirements. You could e.g. set shorter check intervals or fewer check attempts
for “more critical” hosts.

Regards,
Thomas

Not quite: The first check detecting a problem is already the first attempt. So this should be:

5 check attempts = 4 additional minutes

Apart from that, great answers @openmindz!
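
For anyone who wants to try out different values, the arithmetic above can be sketched in a few lines of Python. This is only an illustration of the counting rule described in this thread (the first failed check already counts as attempt 1, and the remaining attempts run at the retry interval). The function names are made up for this example and are not part of CheckMK. The worst-case figure additionally assumes that the host goes down right after a successful regular check, so the outage is only noticed one full check interval later.

```python
def minutes_until_hard_state(max_attempts: int, retry_interval: float) -> float:
    """Minutes between the first failed check and the hard DOWN state.

    The first failed check is attempt 1, so only (max_attempts - 1)
    retries are still needed, each one retry_interval minutes apart.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    return (max_attempts - 1) * retry_interval


def worst_case_notification_delay(
    check_interval: float, max_attempts: int, retry_interval: float
) -> float:
    """Upper bound in minutes from the real outage to the hard state.

    Assumption: the host fails immediately after a successful regular
    check, so up to one full check_interval passes before attempt 1.
    """
    return check_interval + minutes_until_hard_state(max_attempts, retry_interval)


if __name__ == "__main__":
    # Settings from this thread: interval 3 min, 5 attempts, retry 1 min.
    print(minutes_until_hard_state(5, 1))          # 4
    print(worst_case_notification_delay(3, 5, 1))  # 7

    # Hypothetical tighter setting: interval 1 min, 3 attempts, retry 1 min.
    print(minutes_until_hard_state(3, 1))          # 2
    print(worst_case_notification_delay(1, 3, 1))  # 3
```

With the settings from this thread, the hard state is reached 4 minutes after the first failed check, and in the worst case about 7 minutes after the host actually went down. Fewer attempts or a shorter retry interval bring the notification forward, at the cost of more notifications for ordinary reboots.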
