This is on my QA site/server; my prod one is fine.
I’m running 2.0.0p5 on RHEL 7 and all my hosts are reporting “DOWN”, even though all service checks are running fine.
A command line is worth a thousand explanations:
If I run
/opt/omd/versions/2.0.0p5.cee/lib/nagios/plugins/check_icmp sqhh-host01
OK - sqhh-host01: rta 0.097ms, lost 0%|rta=0.097ms;200.000;500.000;0; pl=0%;40;80;; rtmax=0.107ms;;;; rtmin=0.092ms;;;;
Likewise with ping:
ping sqhh-host01
PING sqhh-host01 (10.0.x.x) 56(84) bytes of data.
64 bytes from sqhh-host01 (10.0.x.x): icmp_seq=1 ttl=64 time=0.147 ms
If I go to
Setup > Hosts > Main directory > Properties of host aqhh-host01 > Test connection to host aqhh-host01
PING 10.0.x.x (10.0.x.x) 56(84) bytes of data.
64 bytes from 10.0.x.x: icmp_seq=1 ttl=61 time=3.42 ms
64 bytes from 10.0.x.x: icmp_seq=2 ttl=61 time=3.57 ms
--- 10.0.102.77 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 201ms
rtt min/avg/max/mdev = 3.422/3.496/3.571/0.095 ms, ipg/ewma 201.542/3.440 ms
If I set it to ping in “Host Monitoring Rules/Host Check Command”, cmc.log shows:
[root@sqas-toolserver log]# tail cmc.log
2021-08-02 10:02:48 [5] [icmpreceiver 22494] started, commandline: /omd/sites/qa/lib/cmc/icmpreceiver
2021-08-02 10:02:51 [4] [icmpreceiver 22494] closed connection
2021-08-02 10:02:51 [3] [icmpreceiver 22494] exited with status 1
(this repeats on and on)
All hosts report “DOWN”.
If I set it back to Smart PING in “Host Monitoring Rules/Host Check Command”, cmc.log shows:
2021-08-02 10:08:58 [5] [icmpsender 25405] started, commandline: /omd/sites/qa/lib/cmc/icmpsender 8 0 1000
2021-08-02 10:08:58 [4] [icmpreceiver 25403] closed connection
2021-08-02 10:08:58 [3] [icmpreceiver 25403] exited with status 1
2021-08-02 10:47:39 [5] [icmpreceiver 8987] started, commandline: /omd/sites/qa/lib/cmc/icmpreceiver
2021-08-02 10:47:39 [5] [core 19105] Executing external command: LOG;SERVICE NOTIFICATION: someone@somewhere.com;sxxx-xxx01;Filesystem /opt/oink;OK;mail;89.76% used (71.77 of 79.96 GB), trend: -1.99 GB / 24 hours
2021-08-02 10:47:39 [4] [icmpsender 8983] Cannot send IP addresses to icmpsender: Broken pipe
2021-08-02 10:47:39 [3] [icmpsender 8983] exited with status 1
All hosts report “DOWN”. I left the service notification in there to show that service checks are working.
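In case it helps anyone compare against their own site, here is a rough sketch of how the helper crashes in cmc.log could be tallied. It only assumes the line format visible in the excerpts above; the function name `count_helper_exits` and the regular expression are my own and not part of Checkmk.

```python
import re
from collections import Counter

# Matches lines shaped like the ones in the excerpts above, e.g.
# 2021-08-02 10:02:51 [3] [icmpreceiver 22494] exited with status 1
# (assumption: date, time, [level], [helper pid], then the message)
EXIT_RE = re.compile(
    r"^\S+ \S+ \[\d+\] \[(?P<helper>\w+) (?P<pid>\d+)\] "
    r"exited with status (?P<status>\d+)"
)


def count_helper_exits(lines):
    """Count 'exited with status N' lines per (helper name, exit status)."""
    counts = Counter()
    for line in lines:
        match = EXIT_RE.match(line)
        if match:
            counts[(match["helper"], int(match["status"]))] += 1
    return counts


def count_helper_exits_in_file(path):
    """Same tally, reading the log from a file such as cmc.log."""
    with open(path, errors="replace") as handle:
        return count_helper_exits(handle)
```

Calling `count_helper_exits_in_file("cmc.log")` from the log directory returns a `Counter`, so `.most_common()` lists which helper (icmpreceiver or icmpsender) dies most often and with which status.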
If I set it to “TCPCONNECT”, a bunch of hosts come up. That is not a workable fix, though: the only port most hosts have open is ssh, and not ALL hosts have ssh.
Any ideas on what else to check or what’s going on? This issue also predates my upgrade to Check_mk 2.0.
Thanks.