Agent not listening on port 6556, but is running

CMK version: 2.2.0p26
OS version: Debian Bookworm

I updated our Checkmk server from 2.2.0pX to 2.2.0p26, and the agents as well. For one agent I receive the following error:

Services: all up to date, Host labels: all up to date, [agent] Communication failed: [Errno 111] Connection refused

The agents are there:

-rw-r--r-- 1 root root 330  8. Mai 12:04  check-mk-agent-async.service
-rw-r--r-- 1 root root 354  8. Mai 12:04 'check-mk-agent@.service'
-rw-r--r-- 1 root root 246  8. Mai 12:04  check-mk-agent.socket

They are running:

systemctl status check-mk-agent.socket
● check-mk-agent.socket - Local Checkmk agent socket
     Loaded: loaded (/lib/systemd/system/check-mk-agent.socket; enabled; preset: enabled)
     Active: active (listening) since Fri 2024-05-17 09:28:01 CEST; 19min ago
       Docs: https://docs.checkmk.com/latest/en/agent_linux.html
     Listen: /run/check-mk-agent.socket (Stream)
   Accepted: 0; Connected: 0;
      Tasks: 0 (limit: 193272)
     Memory: 0B
        CPU: 758us
     CGroup: /system.slice/check-mk-agent.socket

Mai 17 09:28:01 mail systemd[1]: Starting check-mk-agent.socket - Local Checkmk agent socket...
Mai 17 09:28:01 mail systemd[1]: Listening on check-mk-agent.socket - Local Checkmk agent socket.

Also the async-Service:

 systemctl status check-mk-agent-async.service
● check-mk-agent-async.service - Checkmk agent - Asynchronous background tasks
     Loaded: loaded (/lib/systemd/system/check-mk-agent-async.service; enabled; preset: enabled)
     Active: active (running) since Fri 2024-05-17 09:36:28 CEST; 11min ago
       Docs: https://docs.checkmk.com/latest/en/agent_linux.html
   Main PID: 294785 (check_mk_agent)
      Tasks: 2 (limit: 193272)
     Memory: 1.6M
        CPU: 883ms
     CGroup: /system.slice/check-mk-agent-async.service
             ├─294785 /bin/bash /usr/bin/check_mk_agent
             └─295257 sleep 60

Mai 17 09:36:28 mail systemd[1]: Started check-mk-agent-async.service - Checkmk agent - Asynchronous background tasks.

But there is no process listening on port 6556:

 netstat -tulpn|grep 6556

This is the process list:

 ps ax|grep check
 294785 ?        Ss     0:00 /bin/bash /usr/bin/check_mk_agent

There is no firewall configured for this server, at least not for the internal IP address.

So why is the service not listening on port 6556, any ideas?

First of all, what’s the content of check-mk-agent.socket? Please paste the output of systemctl cat check-mk-agent.socket.

Next, try restarting that unit & see if that changes things:

systemctl restart check-mk-agent.socket
lsof -PniTCP:6556 -sTCP:LISTEN

Obviously run all commands as root.

Attention: if this is Debian Bookworm with agent 2.2, the service check-mk-agent@.service should not be there.
Normally you now have the cmk-agent-ctl-daemon.service, and that service is the one using port 6556.

ss -tulpn | grep 6556
tcp   LISTEN 0      4096               *:6556             *:*    users:(("cmk-agent-ctl",pid=1206529,fd=9))

First check the status of the agent controller with systemctl status cmk-agent-ctl-daemon

Thanks for your reply.

Output of check-mk-agent.socket:

# systemctl cat check-mk-agent.socket
# /lib/systemd/system/check-mk-agent.socket
[Unit]
Description=Local Checkmk agent socket
Documentation=https://docs.checkmk.com/latest/en/agent_linux.html

[Socket]
ListenStream=/run/check-mk-agent.socket
SocketUser=cmk-agent
SocketMode=0240
Accept=true

[Install]
WantedBy=sockets.target

Output of lsof:

# lsof -PniTCP:6556 -sTCP:LISTEN
#

The socket is a local socket and not a network one.
That’s correct.
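
To illustrate the local-versus-network point: systemd picks the socket type from the form of the ListenStream= value. A minimal sketch, assuming a helper called listen_kind (a made-up name, not part of Checkmk or systemd) and simplifying everything that is not a path or abstract socket to "network":

```shell
# Hypothetical helper for illustration only: classify a ListenStream= value
# roughly the way systemd.socket(5) describes it.
listen_kind() {
    case "$1" in
        /*|@*) echo "unix" ;;     # filesystem path or abstract-namespace socket
        *)     echo "network" ;;  # simplified: port number or address:port
    esac
}

listen_kind /run/check-mk-agent.socket   # prints: unix
listen_kind 6556                         # prints: network
```

So a unit with ListenStream=/run/check-mk-agent.socket will never show up in netstat or lsof as a listener on TCP port 6556.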

# cat /etc/debian_version
12.5

I downloaded the agent from checkmk “Setup->Agent->Linux” as deb and installed it:

apt install ./check-mk-agent_2.2.0p26-1_all.deb

But this is not a fresh install; an older version was there before.

Output of cmk-agent-ctl-daemon.service:

# systemctl status cmk-agent-ctl-daemon.service
× cmk-agent-ctl-daemon.service - Checkmk agent controller daemon
     Loaded: loaded (/lib/systemd/system/cmk-agent-ctl-daemon.service; enabled; preset: enabled)
     Active: failed (Result: exit-code) since Sun 2024-05-12 12:24:38 CEST; 4 days ago
   Duration: 1ms
       Docs: https://docs.checkmk.com/latest/en/agent_linux.html
    Process: 231 ExecStart=/usr/bin/cmk-agent-ctl daemon (code=exited, status=226/NAMESPACE)
   Main PID: 231 (code=exited, status=226/NAMESPACE)
        CPU: 1ms

Mai 12 12:24:38 mail systemd[1]: cmk-agent-ctl-daemon.service: Scheduled restart job, restart counter is at 5.
Mai 12 12:24:38 mail systemd[1]: Stopped cmk-agent-ctl-daemon.service - Checkmk agent controller daemon.
Mai 12 12:24:38 mail systemd[1]: cmk-agent-ctl-daemon.service: Start request repeated too quickly.
Mai 12 12:24:38 mail systemd[1]: cmk-agent-ctl-daemon.service: Failed with result 'exit-code'.
Mai 12 12:24:38 mail systemd[1]: Failed to start cmk-agent-ctl-daemon.service - Checkmk agent controller daemon.

That’s the problem. You need to look in the log to see what happened there.

2024-05-17T11:05:25.816252+02:00 hostname systemd[1]: Started cmk-agent-ctl-daemon.service - Checkmk agent controller daemon.
2024-05-17T11:05:25.817168+02:00 hostname (gent-ctl)[298279]: cmk-agent-ctl-daemon.service: Failed to set up mount namespacing: Permission denied
2024-05-17T11:05:25.817260+02:00 hostname (gent-ctl)[298279]: cmk-agent-ctl-daemon.service: Failed at step NAMESPACE spawning /usr/bin/cmk-agent-ctl: Permission denied
2024-05-17T11:05:25.817725+02:00 hostname systemd[1]: cmk-agent-ctl-daemon.service: Main process exited, code=exited, status=226/NAMESPACE
2024-05-17T11:05:25.818101+02:00 hostname systemd[1]: cmk-agent-ctl-daemon.service: Failed with result 'exit-code'.

What does “mount namespacing” mean?

OK, I googled a bit and found a connection to LXC containers on Proxmox, which is what I use. The LXC should have nesting enabled, which I already have. I found a command, run on the host, that enforces nesting for that LXC:

pct set CTID -features nesting=1 && pct reboot CTID

Will try that and report back.


That last command worked. The agent is now listening on port 6556 and is reachable in Checkmk.
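
For anyone landing here with the same symptom, the diagnosis can be scripted. A rough sketch, assuming the "226/NAMESPACE" status format shown in the systemctl output earlier in this thread; the function names are made up for illustration, and nothing here changes any configuration:

```shell
# Hypothetical helpers, names chosen for illustration only.

# Extract the failing step from a status field such as "226/NAMESPACE".
failed_step() {
    case "$1" in
        */*) echo "${1#*/}" ;;   # keep the part after the first slash
        *)   echo "" ;;          # no step name present
    esac
}

# True if the step says systemd could not set up namespaces for the unit.
is_namespace_failure() {
    [ "$(failed_step "$1")" = "NAMESPACE" ]
}

# Example with the value from the status output in this thread:
if is_namespace_failure "226/NAMESPACE"; then
    echo "namespace setup failed; in an LXC container, check the nesting feature"
fi
```

On a live system, systemd-detect-virt --container reports whether you are inside a container at all, which helps decide whether the nesting setting is even relevant.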

This was a helpful thread for me over the weekend, when I had some trouble with a Debian LXC of my own. Two things I noticed:

  1. Enabling nesting for this Debian-LXC actually helped.
  2. Following feedback from a colleague, I tried an Ubuntu LXC next. With that container I did not have to activate nesting. The agent controller ran without any issue after installing the agent.

I have no clue what the difference between the Debian CT template and the Ubuntu template (both taken directly from Proxmox) is. At the moment I don’t have time to investigate, but to anybody who a) does not have to use Debian and b) does not want to activate nesting, an Ubuntu-LXC might be the way to go.

P.S.: Of course, it would be nice to find out, what is actually going on.


Hello, I have the same issue on version 2.4, on Windows.
The agent is installed but doesn't listen on that port. Why?
Every other monitoring software works, but whatever I try to configure here, nothing works.
The first agent install was deleted by Defender; that is sorted out now.
I don't want to keep fixing other issues all the time. What can I do?

You should find some steps you can take in the thread “Beware of 2.3.0p23 if you're using Defender!”.