Hello,
I have a problem with active checks producing so many false positives that a real problem would no longer be noticed.
This is part of the “HTTPS Webserver OK > CRIT” message:
|Date / Time|Tue Aug 30 00:09:07 CEST 2022|
|Summary|CRITICAL - Socket timeout after 10 seconds|
|Details||
|Host Metrics|rta=0.137ms;200.000;500.000;0; pl=0%;80;100;; rtmax=0.284ms;;;; rtmin=0.063ms;;;;|
And this is the OK message I get 50 seconds later.
|Date / Time|Tue Aug 30 00:09:57 CEST 2022|
|Summary|HTTP OK: HTTP/1.1 302 Found - 784 bytes in 0.733 second response time|
|Details||
|Host Metrics|rta=0.086ms;200.000;500.000;0; pl=0%;80;100;; rtmax=0.209ms;;;; rtmin=0.050ms;;;;|
|Service Metrics|time=0.733495s;;;0.000000;10.000000 size=784B;;;0|
I get too many of these messages to believe that they indicate a real problem on my site. Besides, the server itself (as opposed to the website running on it) is alive and reachable via its IP address 100% of the time.
This leaves me with two questions:
1) Why do I get so many false positives with the default settings?
2) Is there anything I can change globally to make all self-defined active checks a useful source of information again?
I am using Checkmk 2.0.0p23 (CRE). Server and client both run on Proxmox VMs with Debian 11.
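In case it helps with reading the metrics lines above, here is a minimal Python sketch that splits such a string into its fields. It assumes the usual Nagios-style performance data layout `label=value[unit];warn;crit;min;max` with plain numeric thresholds; the names `Metric` and `parse_perfdata` are made up for illustration and are not part of Checkmk.

```python
import re
from typing import List, NamedTuple, Optional


class Metric(NamedTuple):
    """One performance-data entry (hypothetical helper type)."""
    label: str
    value: float
    unit: str
    warn: Optional[float]
    crit: Optional[float]
    minimum: Optional[float]
    maximum: Optional[float]


# Leading number followed by an optional unit such as "ms", "s", "B" or "%".
_VALUE_RE = re.compile(r"^(-?\d+(?:\.\d+)?)(.*)$")


def _to_float(text: str) -> Optional[float]:
    # Empty fields mean "not set". Assumption: thresholds are plain numbers,
    # not range expressions such as "10:20".
    return float(text) if text else None


def parse_perfdata(line: str) -> List[Metric]:
    """Split a space-separated performance-data string into Metric tuples."""
    metrics = []
    for token in line.split():
        label, _, rest = token.partition("=")
        fields = rest.split(";")
        match = _VALUE_RE.match(fields[0])
        if not match:
            continue  # skip anything that does not look like label=value
        # Pad so that warn, crit, min and max are always present.
        fields += [""] * (5 - len(fields))
        metrics.append(
            Metric(
                label,
                float(match.group(1)),
                match.group(2),
                *(_to_float(f) for f in fields[1:5]),
            )
        )
    return metrics


if __name__ == "__main__":
    service = "time=0.733495s;;;0.000000;10.000000 size=784B;;;0"
    for metric in parse_perfdata(service):
        print(metric)
```

Fed the service metrics line from the OK message, this reports no warn or crit threshold for `time` and a maximum of 10, which looks like the same 10 second limit named in the CRITICAL summary.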
Yours sincerely
Stefan Schumacher