Checkmk Agent becomes a zombie through Docker Check Plugin

CMK version: Checkmk Cloud Edition 2.2.0p12 (eval)
OS version: Ubuntu Server 22.04

Error message: No feedback from the server to be monitored. All CheckMK agent processes are zombies. (Docker Server only because the Plugin)

Dockerplugin was installed as root.

pip3 install docker

Status of the Agent on a docker Server:

cmk-agent-ctl status
Version: 2.2.0p12
Agent socket: operational
IP allowlist: any
Legacy mode: enabled
No connections

 343566 ?        S      0:00  \_ /bin/bash /usr/bin/check_mk_agent
 343568 ?        S      0:00      \_ cat
 516871 ?        Ss     0:00 /bin/bash /usr/bin/check_mk_agent
 516890 ?        S      0:00  \_ /bin/bash /usr/bin/check_mk_agent
 517109 ?        S      0:00  |   \_ python3 ./mk_docker.py
 517146 ?        S      0:00  |       \_ python3 ./mk_docker.py
 517151 ?        Z      0:00  |       \_ [python3] <defunct>
 517152 ?        Z      0:00  |       \_ [python3] <defunct>
 517153 ?        Z      0:00  |       \_ [python3] <defunct>
 517154 ?        Z      0:00  |       \_ [python3] <defunct>
 517155 ?        Z      0:00  |       \_ [python3] <defunct>
 517156 ?        Z      0:00  |       \_ [python3] <defunct>
 517158 ?        Z      0:00  |       \_ [python3] <defunct>
 517159 ?        Z      0:00  |       \_ [python3] <defunct>
 517163 ?        Z      0:00  |       \_ [python3] <defunct>
 516891 ?        S      0:00  \_ /bin/bash /usr/bin/check_mk_agent
 516893 ?        S      0:00      \_ cat
 869527 ?        Ss     0:00 /bin/bash /usr/bin/check_mk_agent
 869546 ?        S      0:00  \_ /bin/bash /usr/bin/check_mk_agent
 869781 ?        S      0:00  |   \_ python3 ./mk_docker.py
 869804 ?        S      0:00  |       \_ python3 ./mk_docker.py
 869809 ?        Z      0:00  |       \_ [python3] <defunct>
 869810 ?        Z      0:00  |       \_ [python3] <defunct>
 869811 ?        Z      0:00  |       \_ [python3] <defunct>
 869812 ?        Z      0:00  |       \_ [python3] <defunct>
 869813 ?        Z      0:00  |       \_ [python3] <defunct>
 869815 ?        Z      0:00  |       \_ [python3] <defunct>
 869816 ?        Z      0:00  |       \_ [python3] <defunct>

My solution of the time is:

systemctl stop cmk-agent-ctl-daemon.service
killall check_mk_agent                     
systemctl start cmk-agent-ctl-daemon.service

Many thanks for some help :wink:

Hi @linux,

have you checked the execution time of the Docker plugin when executed manually?

Would be interesting to see if the mk_docker can execute at all.

Best Regards
Norm

We had the same problem several times, but couldn’t find the cause. The mk_docker Plugin simply continues to run indefinitely. We then terminated all monitoring processes and everything ran again

A few ideas from us:

Yes this works normal.

@LaSoe
Ok, i’ve changed the checkdelay of the docker plugin to every 5 minutes. Maybe it helps.

@linux
you can run the plugin asynchronously, then the current 2.2 agent should automatically terminate the process if it’s running longer then twice the check interval time (I have not tested if this really works).

Another advantage is that the agent does not have to wait for the plugin to finish in the event of slow processing and can deliver all other states independently of this check.

1 Like

Config with the 5 minutes actually seems to have worked. It has not occurred again since then. :sunglasses: