CMK version: 2.1.0p26_0
OS version: Server and agent Debian 11
Error message:
Server: Service - Check_MK - [agent] MKTimeout(‘Fetcher for host “my_server” timed out after 60 seconds’) CRIT, Got no information from host CRIT, execution time 60.0 sec
CRIT
Check_MK Discovery no unmonitored services found, 37 vanished services (apt:1, checkmk_agent:1, cpu_loads:1, cpu_threads:1, df:3, diskstat:1, kernel_performance:1, kernel_util:1, lnx_if:12, md:4, mem_linux:1, mounts:3, mrpe:3, systemd_units_services_summary:1, tcp_conn_stats:1, timesyncd:1, uptime:1), no new host labels, [agent] MKTimeout(‘Fetcher for host “my_server” timed out after 60 seconds’
Agent: ● cmk-agent-ctl-daemon.service - Checkmk agent controller daemon
Loaded: loaded (/lib/systemd/system/cmk-agent-ctl-daemon.service; enabled; vendor preset: enabled)
Active: active (running) since Sun 2023-05-07 16:34:36 CEST; 4h 0min ago
Main PID: 1009 (cmk-agent-ctl)
Tasks: 3 (limit: 77026)
Memory: 7.4M
CPU: 1.537s
CGroup: /system.slice/cmk-agent-ctl-daemon.service
└─1009 /usr/bin/cmk-agent-ctl daemon
May 07 20:23:29 my_server cmk-agent-ctl[1009]: WARN [cmk_agent_ctl::modes::pull] [::ffff:192.168.122.server]:59942: Request failed. (Too many active connections)
May 07 20:24:29 my_server cmk-agent-ctl[1009]: WARN [cmk_agent_ctl::modes::pull] [::ffff:192.168.122.server]:40886: Request failed. (Too many active connections)
May 07 20:24:34 my_server cmk-agent-ctl[1009]: WARN [cmk_agent_ctl::modes::pull] [::ffff:192.168.122.server]:34248: Request failed. (Broken pipe (os error 32))
May 07 20:26:35 my_server cmk-agent-ctl[1009]: WARN [cmk_agent_ctl::modes::pull] [::ffff:192.168.122.server]:59088: Request failed. (Broken pipe (os error 32))
May 07 20:28:34 my_server cmk-agent-ctl[1009]: WARN [cmk_agent_ctl::modes::pull] [::ffff:192.168.122.server]:50620: Request failed. (Broken pipe (os error 32))
May 07 20:31:29 my_server cmk-agent-ctl[1009]: WARN [cmk_agent_ctl::modes::pull] [::ffff:192.168.122.server]:54964: Request failed. (Too many active connections)
May 07 20:32:29 my_server cmk-agent-ctl[1009]: WARN [cmk_agent_ctl::modes::pull] [::ffff:192.168.122.server]:39884: Request failed. (Too many active connections)
May 07 20:33:29 my_server cmk-agent-ctl[1009]: WARN [cmk_agent_ctl::modes::pull] [::ffff:192.168.122.server]:40948: Request failed. (Too many active connections)
May 07 20:34:29 my_server cmk-agent-ctl[1009]: WARN [cmk_agent_ctl::modes::pull] [::ffff:192.168.122.server]:33710: Request failed. (Too many active connections)
May 07 20:34:35 my_server cmk-agent-ctl[1009]: WARN [cmk_agent_ctl::modes::pull] [::ffff:192.168.122.server]:52878: Request failed. (Broken pipe (os error 32))
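To see at a glance which failure dominates in journal output like the lines above, the warnings can be tallied by reason. This is a minimal Python sketch, assuming every line has the `Request failed. (<reason>)` shape shown here. `tally_failures` is a hypothetical helper, not part of Checkmk.

```python
import re
from collections import Counter

# Matches the trailing "Request failed. (<reason>)" part of a log line.
# The greedy group keeps nested parentheses such as "(os error 32)".
FAILURE_RE = re.compile(r"Request failed\. \((.+)\)\s*$")


def tally_failures(lines):
    """Count cmk-agent-ctl 'Request failed' warnings by reason.

    Hypothetical helper: lines is any iterable of log lines, for example
    the text printed by `journalctl -u cmk-agent-ctl-daemon`.
    """
    counts = Counter()
    for line in lines:
        match = FAILURE_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Calling `tally_failures(sys.stdin)` on piped journal output returns a `Counter`, so `.most_common()` lists the most frequent reason first.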
root@myserver ~ # service check-mk-agent [pressing Tab]
check-mk-agent@67-1009-998 check-mk-agent@70-92535-998
check-mk-agent@68-1009-998 check-mk-agent-async
check-mk-agent@69-1009-998
What is this? Why are there different services here?
When I enter the command “cmk-agent-ctl dump”, I receive no output. On other clients, however, I get output quickly. Does anyone have an idea what the problem is here? I reinstalled the checkmk-agent and have had this error ever since.
That’s strange. I have only 1 server with 1 instance. I had a different checkmk server before. How can I clean up the unnecessary connections? I don’t have a cmk-agent-ctl-daemon.service.d folder in my systemd directory. I’m slowly losing track of what is going on.
Looks like dangling connections: connections that time out but never get closed. First get the open network connections on the CMK server, for example with lsof or sockstat. Then try
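The first step, listing open connections on the CMK server, could look like the following Python sketch. It assumes the agent listens on the default port 6556 and that `ss` from iproute2 is installed. `count_agent_connections` and `read_ss` are hypothetical helpers, not Checkmk tools.

```python
import subprocess
from collections import Counter

AGENT_PORT = 6556  # default Checkmk agent port; an assumption, adjust if changed


def count_agent_connections(ss_output, port=AGENT_PORT):
    """Count TCP connections per (peer address, state) for a given peer port.

    ss_output is the text printed by `ss -tan`, whose columns are:
    State, Recv-Q, Send-Q, Local Address:Port, Peer Address:Port.
    """
    counts = Counter()
    for line in ss_output.splitlines():
        fields = line.split()
        if len(fields) < 5 or fields[0] == "State":
            continue  # skip the header and malformed lines
        state, peer = fields[0], fields[4]
        # Split on the last colon so IPv6 addresses in brackets stay intact.
        address, _, peer_port = peer.rpartition(":")
        if peer_port == str(port):
            counts[(address, state)] += 1
    return counts


def read_ss():
    """Return the output of `ss -tan` (all TCP sockets, numeric addresses)."""
    return subprocess.run(
        ["ss", "-tan"], capture_output=True, text=True, check=True
    ).stdout
```

Running `count_agent_connections(read_ss())` on the CMK server gives one count per monitored host and connection state. Several entries for the same host that stay in a state such as CLOSE-WAIT would be consistent with dangling connections, though that alone does not prove it.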
Actually… it worked. Apparently, an Nginx plugin is causing issues here. Once I deactivate it, there are no more problems. Interestingly, the entire agent works fine on another client. There, the Nginx plugin also recognizes a web server.
Should I remove unnecessary plugins, or is there another way to work around this issue? Nginx is not in use on the faulty host itself, only inside a Docker environment as a reverse proxy.
$ systemctl status cmk-agent-ctl-daemon
…
WARN [cmk_agent_ctl::modes::pull] [::ffff:10.141.11.248]:40570: Request failed. (Too many active connections)
Thanks for the hint to take a closer look at the plugin directory!
/usr/lib/check_mk_agent/plugins/
In my case the nvidia plugin took very long to respond because nvidia-smi reported an error with one of the GPU cards.
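A slow plugin like this can be spotted by running each plugin by hand and timing it. The Python sketch below is one way to do that, assuming the plugin directory named above. `time_command` and `time_plugins` are hypothetical helpers, and the 10-second limit is an arbitrary choice. Plugins may rely on environment variables that the agent normally sets, so the timings are only indicative.

```python
import os
import subprocess
import time
from pathlib import Path


def time_command(cmd, timeout=10.0):
    """Run cmd and return (seconds, status).

    status is "ok", "exit <code>", "timeout" or "error: <reason>".
    Output is discarded; only duration and outcome are of interest.
    """
    start = time.monotonic()
    try:
        proc = subprocess.run(
            cmd,
            stdin=subprocess.DEVNULL,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
        status = "ok" if proc.returncode == 0 else f"exit {proc.returncode}"
    except subprocess.TimeoutExpired:
        status = "timeout"  # subprocess.run kills the child in this case
    except OSError as exc:
        status = f"error: {exc.strerror}"
    return time.monotonic() - start, status


def time_plugins(plugin_dir, timeout=10.0):
    """Time every executable file directly inside plugin_dir.

    Returns (name, seconds, status) tuples, slowest first.
    """
    directory = Path(plugin_dir)
    if not directory.is_dir():
        return []
    results = []
    for path in sorted(directory.iterdir()):
        if path.is_file() and os.access(path, os.X_OK):
            seconds, status = time_command([str(path)], timeout)
            results.append((path.name, seconds, status))
    return sorted(results, key=lambda item: item[1], reverse=True)
```

Run as root on the affected host, `time_plugins("/usr/lib/check_mk_agent/plugins")` puts any plugin that hangs or hits the limit at the top of the list.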
So the solution you came up with is to disable the problematic plugin?
Unfortunately, in my case this is impossible, since the VMs are in production and need to be monitored 24/7.
Restarting the cmk-agent-ctl-daemon.service did the trick, but this cannot be considered a permanent solution.
As you can see in the attached screenshot, the agent was “frozen”. Since I restarted it, I have had no problems since 26/05:
This plugin is present on several monitored VMs, but only this one is causing issues.
The checks for every service are executed every 2 minutes.
Is there a way to prevent the agent from “freezing” again? Could a wrong configuration of the mk_saphana plugin be the root cause of this problem?