We have about 100 Windows servers which we monitor via the agent.
Everything is running smoothly.
But about once a month, one of the hosts is no longer reachable from checkmk.
Sometimes, it is reachable from another host with port 6556 but not from checkmk.
We had that in 1.6 and also in 2.0. We tested several versions, and also tried updating the agent on the host.
The only way to get this host back into an active state is to uninstall the agent,
delete C:\Programdata\checkmk,
and install the agent again.
I haven’t found anything which can lead to this timeout from checkmk.
This sounds like a problem related to firewall settings or antivirus programs that block the request to the agent, especially because you say it happens with different versions of your monitoring server and agent. Have you also tried restarting the service on Windows when the problem occurs?
Hi, now I have a host which is unreachable from checkmk:
I was able to connect twice, but nothing was returned; after that, the connection was refused:
root@checkmk:~# telnet 192.168.11.228 6556
Trying 192.168.11.228...
Connected to 192.168.11.228.
Escape character is '^]'.
Connection closed by foreign host.
root@checkmk:~# telnet 192.168.11.228 6556
Trying 192.168.11.228...
Connected to 192.168.11.228.
Escape character is '^]'.
Connection closed by foreign host.
root@checkmk:~# telnet 192.168.11.228 6556
Trying 192.168.11.228...
telnet: Unable to connect to remote host: Connection refused
root@checkmk:~# telnet 192.168.11.228 6556
Trying 192.168.11.228...
telnet: Unable to connect to remote host: Connection refused
root@checkmk:~#
Windows Firewall is disabled
Windows virus & threat protection is disabled
Windows app & browser control is disabled
no additional Anti Virus is installed
Can you also check whether there is a setting on the Windows service that only allows certain IPs to contact it? There is an option with this function inside the agent config too. But this wouldn’t explain why it works at the beginning and stops working after some time.
When this situation occurs, can you also check whether port 6556 is open and listening on the Windows machine? If it’s not, there may be something wrong with the agent. If it’s open and listening, the network traffic is being blocked between the monitoring host and the agent.
I haven’t found an IP restriction on the CheckMK service.
The agent is also not restricted to any specific IPs.
The server is still listening on the port: TCP 0.0.0.0:6556 0.0.0.0:0 LISTENING
I can connect successfully from the server on which the agent is installed.
I can also connect to the agent from another host.
But not from checkmk anymore. I’ve also checked the firewall; nothing is blocked there, otherwise I would get a timeout.
Maybe a problem on the checkMK server host itself?
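The distinction made above (a refusal versus a timeout versus a connection that opens and then closes without data) can be checked in a repeatable way from the monitoring server. The following is only a rough sketch using the Python standard library; the function name probe_agent and the result labels are made up for illustration and are not part of checkmk. It assumes the agent answers on TCP port 6556. A result of "ok" only means that some bytes arrived, which may not be readable text if encrypted agent communication is enabled.

```python
import socket


def probe_agent(host, port=6556, timeout=5.0):
    """Classify a single TCP connection attempt to the agent port.

    The labels returned here are illustrative, not checkmk terminology.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.settimeout(timeout)
            try:
                data = sock.recv(4096)
            except socket.timeout:
                # Connection is open but the agent sent nothing in time.
                return "connected, no data before timeout"
            except ConnectionResetError:
                return "connected, then reset by peer"
            if not data:
                # Matches the telnet sessions that connected and were closed.
                return "connected, closed without data"
            return "ok"
    except ConnectionRefusedError:
        # Nothing is accepting connections on the port (active refusal).
        return "refused"
    except socket.timeout:
        # Typical when packets are silently dropped on the way.
        return "timeout"
    except OSError as exc:
        return f"error: {exc}"


if __name__ == "__main__":
    # IP address taken from the telnet session earlier in this thread.
    print(probe_agent("192.168.11.228"))
```

Running the same script on the checkmk server and on the other host that can still connect would show whether the two really see different behaviour at the same moment.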
What does your monitoring server report when you connect to this port with something other than the telnet client? For example: nc -nv <ip> 6556 (on your server the command may be called netcat).
A totally different question: have you enabled encrypted agent communication?
The Check MK Service service terminated unexpectedly. It has done this 1336 time(s). The following corrective action will be taken in 2000 milliseconds: Restart the service.
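If the service really is crashing and being restarted every couple of seconds, the port might flap between open and refused, which would be consistent with the telnet results earlier in the thread, although the thread does not confirm that this is the cause. Below is a small sketch for sampling the port over time and keeping only the state changes, so they can be compared against the timestamps of these service events. The names tcp_state, transitions and watch are hypothetical, and the one second interval is just an assumption.

```python
import socket
import time


def tcp_state(host, port=6556, timeout=3.0):
    """Return 'open', 'refused', 'timeout' or 'error' for one connect attempt."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "refused"
    except socket.timeout:
        return "timeout"
    except OSError:
        return "error"


def transitions(samples):
    """Keep only the (timestamp, state) samples where the state changed."""
    changes = []
    previous = None
    for stamp, state in samples:
        if state != previous:
            changes.append((stamp, state))
            previous = state
    return changes


def watch(host, port=6556, interval=1.0, count=60):
    """Sample the port `count` times and return the observed state changes."""
    samples = []
    for _ in range(count):
        samples.append((time.strftime("%H:%M:%S"), tcp_state(host, port)))
        time.sleep(interval)
    return transitions(samples)


if __name__ == "__main__":
    # IP address taken from the telnet session earlier in this thread.
    for stamp, state in watch("192.168.11.228"):
        print(stamp, state)
```

Many short open/refused changes within a minute would point towards the agent service itself rather than the network path.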