All hosts marked "down". Service checks ok. Can ping fine from commandline to servers

This is on my QA site/server; my production one is fine.
A command line is worth a thousand explanations:

I’m running 2.0.0p5 on RHEL 7, and all my hosts are reported as “DOWN” even though all service checks are returning OK.
If I run

/opt/omd/versions/2.0.0p5.cee/lib/nagios/plugins/check_icmp sqhh-host01

OK - sqhh-host01: rta 0.097ms, lost 0%|rta=0.097ms;200.000;500.000;0; pl=0%;40;80;; rtmax=0.107ms;;;; rtmin=0.092ms;;;;
Likewise for “ping sqhh-host01”:

ping sqgs-host01

PING sqhh-host01 (10.0.x.x) 56(84) bytes of data.
64 bytes from sqhh-host01 (10.0.x.x): icmp_seq=1 ttl=64 time=0.147 ms
If I go to
Setup > Hosts > Main directory > Properties of host aqhh-host01 > Test connection to host aqhh-host01:
PING 10.0.x.x (10.0.x.x) 56(84) bytes of data.
64 bytes from 10.0.x.x: icmp_seq=1 ttl=61 time=3.42 ms
64 bytes from 10.0.x.x: icmp_seq=2 ttl=61 time=3.57 ms
--- 10.0.102.77 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 201ms
rtt min/avg/max/mdev = 3.422/3.496/3.571/0.095 ms, ipg/ewma 201.542/3.440 ms

I set it to ping in “Host Monitoring Rules/Host Check Command”:
[root@sqas-toolserver log]# tail cmc.log
2021-08-02 10:02:48 [5] [icmpreceiver 22494] started, commandline: /omd/sites/qa/lib/cmc/icmpreceiver
2021-08-02 10:02:51 [4] [icmpreceiver 22494] closed connection
2021-08-02 10:02:51 [3] [icmpreceiver 22494] exited with status 1
(this repeats over and over)
All hosts report “DOWN”.

I set it back to Smart PING in “Host Monitoring Rules/Host Check Command”:
2021-08-02 10:08:58 [5] [icmpsender 25405] started, commandline: /omd/sites/qa/lib/cmc/icmpsender 8 0 1000
2021-08-02 10:08:58 [4] [icmpreceiver 25403] closed connection
2021-08-02 10:08:58 [3] [icmpreceiver 25403] exited with status 1
2021-08-02 10:47:39 [5] [icmpreceiver 8987] started, commandline: /omd/sites/qa/lib/cmc/icmpreceiver
2021-08-02 10:47:39 [5] [core 19105] Executing external command: LOG;SERVICE NOTIFICATION: someone@somewhere.com;sxxx-xxx01;Filesystem /opt/oink;OK;mail;89.76% used (71.77 of 79.96 GB), trend: -1.99 GB / 24 hours
2021-08-02 10:47:39 [4] [icmpsender 8983] Cannot send IP addresses to icmpsender: Broken pipe
2021-08-02 10:47:39 [3] [icmpsender 8983] exited with status 1
All hosts report “DOWN”. I left the service notification in there to reaffirm that service checks are working.
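
The helper failures in the cmc.log excerpts above can be pulled out of a busy log with a small filter. This is only a sketch: the function name `icmp_helper_errors` is made up for illustration, and the default log path assumes the usual OMD layout for a site named `qa` (`/omd/sites/qa/var/log/cmc.log`), which may differ on your system.

```shell
#!/bin/sh
# Hypothetical helper: print cmc.log lines where a Smart PING helper
# (icmpsender/icmpreceiver) exited with a non-zero status or could not be fed.
icmp_helper_errors() {
    grep -E '\[icmp(sender|receiver) [0-9]+\] (exited with status [1-9]|Cannot send)' "$1"
}

# Assumed log location for a site called "qa"; override with CMC_LOG=...
log="${CMC_LOG:-/omd/sites/qa/var/log/cmc.log}"
if [ -r "$log" ]; then
    # Show only the most recent matches
    icmp_helper_errors "$log" | tail -n 20
fi
```

If this prints a steady stream of "exited with status 1" lines while service checks keep logging normally, the problem is probably limited to the ICMP helpers rather than the core as a whole.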

If I set it to “TCPCONNECT”, a bunch of hosts come up. However, that is not workable: the only port most hosts have open is SSH, and not all hosts have SSH.

Any ideas on what else to check or what’s going on? This issue also predates my upgrade to Checkmk 2.0.
Thanks.


Hi j_Lowe,

I had the same issue. In my case, the following files are supposed to have the SUID bit set but had somehow lost it after some change:

/opt/omd/sites/cmktcs/lib/cmc/icmpsender
/opt/omd/sites/cmktcs/lib/cmc/icmpreceiver
/opt/omd/sites/cmktcs/lib/nagios/plugins/check_icmp

After running “chmod u+s” on those files, the hosts in the web console came back up.
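
A quick way to check whether this applies to you is to test the setuid bit on the three files listed above. The following is a minimal sketch, not an official Checkmk tool: `report_setuid` is a hypothetical helper name, and the site name (`cmktcs` here, overridable via `SITE`) is an assumption you need to adapt to your own site.

```shell
#!/bin/sh
# Hypothetical helper: report whether each given file has the setuid bit.
report_setuid() {
    for f in "$@"; do
        if [ ! -e "$f" ]; then
            echo "MISSING $f"
        elif [ -u "$f" ]; then      # -u is true when the setuid bit is set
            echo "OK $f"
        else
            echo "NOSUID $f"
        fi
    done
}

# Site name is an assumption; set SITE to your own site name.
site="${SITE:-cmktcs}"
report_setuid \
    "/opt/omd/sites/$site/lib/cmc/icmpsender" \
    "/opt/omd/sites/$site/lib/cmc/icmpreceiver" \
    "/opt/omd/sites/$site/lib/nagios/plugins/check_icmp"

# For any file reported as NOSUID, the fix described above is (as root):
#   chmod u+s <file>
```

The setuid bit makes a program run as the file's owner, so it is worth confirming with `ls -l` that these files are still owned by root. This may also explain why a manual check_icmp or ping run as root works fine: root does not need the setuid bit to send ICMP, whereas the monitoring core runs as the site user.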

