Linux async plugins go stale in 2.1.0 Beta

CMK version: CEE 2.1.0 Beta (daily build 2022-04-04)
OS version: Debian 11 - Docker Image

Hi all

I am having trouble with async check plugins on both Linux hosts monitored by my Checkmk 2.1 beta installation. Both hosts run Debian (one plain Debian, the other Raspberry Pi OS).
On both hosts the agent suddenly (and seemingly randomly) stops running async check plugins. If I understand this correctly, the culprit seems to be the systemd service check-mk-agent-async, which for some reason no longer runs the checks.

Whenever this happens, the cache files in /var/lib/check_mk_agent/cache are no longer updated. Manually running the Checkmk agent rewrites them once, but only restarting the aforementioned service fixes the issue, and only for some time.
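
For anyone who wants to check whether they are affected, here is a minimal sketch that lists cache files that have not been rewritten for a while. The function name `stale_cache_files` and the 3600 second threshold are my own assumptions, not part of Checkmk; pick a threshold that is larger than the longest cache interval of your async plugins.

```python
import time
from pathlib import Path


def stale_cache_files(cache_dir, max_age_seconds, now=None):
    """Return (file name, age in seconds) for files older than max_age_seconds.

    Hypothetical helper, not part of Checkmk. A missing directory yields [].
    """
    directory = Path(cache_dir)
    if not directory.is_dir():
        return []
    now = time.time() if now is None else now
    stale = []
    for entry in sorted(directory.iterdir()):
        if not entry.is_file():
            continue
        # Age is measured from the last modification time of the cache file.
        age = now - entry.stat().st_mtime
        if age > max_age_seconds:
            stale.append((entry.name, int(age)))
    return stale


if __name__ == "__main__":
    # 3600 seconds is an assumed threshold, adjust it to your plugin intervals.
    for name, age in stale_cache_files("/var/lib/check_mk_agent/cache", 3600):
        print(f"{name}: not updated for {age} s")
```

If this prints files whose age keeps growing between runs, the async service has most likely stopped refreshing them.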

Is anyone else facing this issue (in which case this would most likely be a bug)?
Otherwise I’ll have to dig deeper on my installation :wink:

Kind regards

EDIT: Maybe the async service somehow hangs. On an unhealthy system, the status output shows excessive CPU time and two check_mk_agent processes. A healthy system, on the other hand, uses only minor CPU resources and has a sleep 60 as its second process.

systemctl status output

Unhealthy system

● check-mk-agent-async.service - Checkmk agent - Asynchronous background tasks
Loaded: loaded (/lib/systemd/system/check-mk-agent-async.service; enabled; vendor preset: enabled)
Active: active (running) since Fri 2022-03-25 11:48:09 CET; 1 weeks 3 days ago
Main PID: 9331 (check_mk_agent)
Tasks: 2 (limit: 2178)
CPU: 3d 23h 49min 27.078s
CGroup: /system.slice/check-mk-agent-async.service
├─ 9331 /bin/bash /usr/bin/check_mk_agent
└─30957 /bin/bash /usr/bin/check_mk_agent

Healthy system
● check-mk-agent-async.service - Checkmk agent - Asynchronous background tasks
Loaded: loaded (/lib/systemd/system/check-mk-agent-async.service; enabled; vendor preset: enabled)
Active: active (running) since Mon 2022-04-04 14:01:53 CEST; 9min ago
Main PID: 965781 (check_mk_agent)
Tasks: 2 (limit: 18999)
Memory: 70.9M
CPU: 6.036s
CGroup: /system.slice/check-mk-agent-async.service
├─965781 /bin/bash /usr/bin/check_mk_agent
└─980292 sleep 60
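
The difference between the two outputs above can be checked roughly with a small script. This is only a sketch based on the two samples in this post: the function name `classify_async_service` and the labels "healthy", "suspect" and "unknown" are my own invention. A healthy agent may briefly show a second check_mk_agent process while plugins are running, so a single "suspect" snapshot proves nothing; it only matters if it persists together with a high CPU time.

```python
def classify_async_service(status_output):
    """Classify pasted `systemctl status check-mk-agent-async` output.

    Hypothetical heuristic derived from the two samples in this thread:
    - "healthy": an agent process plus a `sleep` child
    - "suspect": two or more agent processes and no `sleep` child
    - "unknown": anything else
    """
    agent_processes = 0
    sleep_processes = 0
    for line in status_output.splitlines():
        # CGroup entries look like "├─ 9331 /bin/bash /usr/bin/check_mk_agent".
        parts = line.replace("├─", " ").replace("└─", " ").split()
        if len(parts) < 2 or not parts[0].isdigit():
            continue  # not a "PID command" line
        command = parts[1:]
        if command[0] == "sleep":
            sleep_processes += 1
        elif any(arg.endswith("check_mk_agent") for arg in command):
            agent_processes += 1
    if agent_processes >= 1 and sleep_processes >= 1:
        return "healthy"
    if agent_processes >= 2 and sleep_processes == 0:
        return "suspect"
    return "unknown"
```

It takes the status text as a string, so the output of `systemctl status check-mk-agent-async` can be captured however you like and passed in.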

I can see the same behaviour with 2.0.0p12 (and on some test hosts with 2.0.0p20) on RHEL 8 systems. The async service works for several (2 or 3?) weeks and then the plugins go stale. And yes, I also have two running /usr/bin/check_mk_agent processes instead of one agent process and a sleep 60.

See also this thread:

Still looking for a solution…

Thank you for this link, Tanja.
In that case I will track the other topic :slight_smile:
