False Down Notifications Checkmk

Hi,

we have the problem that checkmk is sending us a warning (via Teams):

PROBLEM | Hostname : srv-print2 is DOWN -> time was 03:38
        Detail
        CRITICAL - 10.4.200.54: rta nan, lost 100%  
        Affected groups
        check_mk

I don't understand what is happening at that moment. The checkmk agent is installed on the virtual server “srv-print2”.
The server was not rebooted and is still up, but when I check the graph in checkmk I see packet loss.

Two questions now:

  1. Why do I get this packet loss? Was there really a few seconds of packet loss, or are the notifications false?
  2. Is it possible to change the units here in the raw edition? I don’t think that “µs” is helpful for our monitoring.

EDIT: I heard that the power was probably out for a few seconds, but not all virtual machines sent me an alert.

Thanks in advance.

“µs” is the normal order of magnitude for round trip times inside a LAN environment. Your machine was unreachable for around 4 minutes, which makes sense given your power outage: the power is gone for a few seconds, your virtualization host goes down and reboots, and after about 4 minutes it is ready again and you can reach your VM.
If not all VM hosts were rebooting, that would explain why you only got notifications from some systems.
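
If you want to look at such values in a different unit outside of checkmk, here is a minimal Python sketch (not a checkmk feature) that parses the detail line quoted in the notification and normalizes the round trip time to milliseconds. The function name `parse_ping_output` and the regular expression are assumptions modelled only on the line shown above; the unit suffixes (`ms`, `us`, `µs`, `s`) are also assumed, so adjust the pattern to what your site actually prints.

```python
import math
import re

# Assumed pattern, modelled on the notification detail line, e.g.
# "CRITICAL - 10.4.200.54: rta nan, lost 100%".
PING_RE = re.compile(
    r"(?P<state>\w+) - (?P<addr>[\d.]+): "
    r"rta (?P<rta>nan|[\d.]+)(?P<unit>ms|us|µs|s)?, "
    r"lost (?P<loss>\d+)%"
)

# Conversion factors from the (assumed) unit suffix to milliseconds.
TO_MS = {"s": 1000.0, "ms": 1.0, "us": 0.001, "µs": 0.001}


def parse_ping_output(line):
    """Return state, address, round trip time in ms and packet loss in percent."""
    match = PING_RE.search(line)
    if match is None:
        raise ValueError(f"unrecognized ping output: {line!r}")
    # float("nan") is valid, so an unreachable host yields NaN here.
    rta = float(match.group("rta"))
    # Hypothetical default: treat a missing unit suffix as milliseconds.
    unit = match.group("unit") or "ms"
    return {
        "state": match.group("state"),
        "address": match.group("addr"),
        "rta_ms": rta * TO_MS[unit],
        "loss_percent": int(match.group("loss")),
    }


result = parse_ping_output("CRITICAL - 10.4.200.54: rta nan, lost 100%")
print(result["state"], result["loss_percent"], math.isnan(result["rta_ms"]))
```

A round trip time of `nan` together with 100% loss means that no reply came back at all during the check, which fits a host that was really unreachable rather than a false notification.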
