Checkmk Agent Not Listening on 6556 After Reinstalling Agent v2.1.0

CMK version: 2.1.0
OS version: Ubuntu 18.04

Error message: None

We updated the monitoring agent from version 1.2.8p16 to version 2.1.0. Everything seems to be working normally, but nothing is binding to port 6556.

We first upgraded, then tried purging and re-installing but it didn’t help. We also purged xinetd and all of its configuration from the machine, since the Checkmk agent was the only thing running under it.

Systemd shows that all services are running normally:

jasons@agentServer:~$ sudo systemctl status check-mk-agent.socket check-mk-agent-async.service cmk-agent-ctl-daemon.service
● check-mk-agent.socket - Local Checkmk agent socket
   Loaded: loaded (/lib/systemd/system/check-mk-agent.socket; enabled; vendor preset: enabled)
   Active: active (listening) since Wed 2022-11-02 15:25:04 CDT; 13min ago
   Listen: /run/check-mk-agent.socket (Stream)
 Accepted: 0; Connected: 0
    Tasks: 0 (limit: 19140)
   CGroup: /system.slice/check-mk-agent.socket

Nov 02 15:25:04 agentServer systemd[1]: Starting Local Checkmk agent socket.
Nov 02 15:25:04 agentServer systemd[1]: Listening on Local Checkmk agent socket.

● check-mk-agent-async.service - Checkmk agent - Asynchronous background tasks
   Loaded: loaded (/lib/systemd/system/check-mk-agent-async.service; enabled; vendor preset: enabled)
   Active: active (running) since Wed 2022-11-02 15:25:05 CDT; 13min ago
 Main PID: 130047 (check_mk_agent)
    Tasks: 2 (limit: 19140)
   CGroup: /system.slice/check-mk-agent-async.service
           ├─ 42019 sleep 60
           └─130047 /bin/bash /usr/bin/check_mk_agent

Nov 02 15:25:05 agentServer systemd[1]: Started Checkmk agent - Asynchronous background tasks.

● cmk-agent-ctl-daemon.service - Checkmk agent controller daemon
   Loaded: loaded (/lib/systemd/system/cmk-agent-ctl-daemon.service; enabled; vendor preset: enabled)
   Active: active (running) since Wed 2022-11-02 15:25:06 CDT; 13min ago
 Main PID: 130148 (cmk-agent-ctl)
    Tasks: 3 (limit: 19140)
   CGroup: /system.slice/cmk-agent-ctl-daemon.service
           └─130148 /usr/bin/cmk-agent-ctl daemon

Nov 02 15:25:06 agentServer systemd[1]: Started Checkmk agent controller daemon.

What are some other troubleshooting steps we can take to figure out what the issue is?

Hello,
try the tips from the handbook.

Did you update the OS too?

Ralf

I assume some leftover xinetd configuration. Please run:

ss -tulpn | grep 6556

…and post the output.

I followed the troubleshooting steps from the Monitoring Linux page before posting:

Did you update the OS too?

No, only the check-mk-agent package was updated.

The output is empty, because nothing is binding to that port:

jasons@agentServer:~$ sudo ss -tulpn | grep 6556
jasons@agentServer:~$

We also considered that some leftover xinetd configuration was causing an issue, but purging xinetd (sudo apt purge xinetd --autoremove) and then purging and reinstalling the check-mk-agent package did not help.

According to the Monitoring Linux page, empty output from ss -tulpn | grep 6556 means that the requirements for running the agent controller have not been met, but we have other servers running nearly identical configurations where the agent works without issue. The only notable difference is that the server with the issue is the only Linux server where we upgraded from version 1.x of the agent; on every other Linux server, version 2.1.0 was the first version installed.
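
As a side note, a plain grep 6556 can also match a PID or an unrelated port such as 16556, so a stricter check may be useful when comparing hosts. The following is only a sketch: the helper name has_tcp_listener is made up for this example, and it assumes that ss prints LISTEN rows whose local address is a whitespace-separated field ending in :PORT.

```shell
# Hypothetical helper (not part of Checkmk): reads `ss -tln` output on stdin
# and succeeds only if some LISTEN row has a field ending in ":<port>".
has_tcp_listener() {
    awk -v suffix=":$1" '
        $0 ~ /LISTEN/ {
            for (i = 1; i <= NF; i++) {
                # Compare only the tail of the field, so ":16556" does not match ":6556".
                start = length($i) - length(suffix) + 1
                if (start > 0 && substr($i, start) == suffix) found = 1
            }
        }
        END { exit !found }
    '
}

# Usage (on the agent host):
#   sudo ss -tln | has_tcp_listener 6556 && echo "listening" || echo "not listening"
```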

Any additional thoughts or guidance are appreciated.

Thank you,
Jason

There might be some leftovers from previous installations. You may want to first purge the agent, then delete /etc/check_mk and /var/lib/cmk-agent, and afterwards make sure no service named “check_mk*” or “check-mk*” is running anymore.
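
One way to check the last point is to filter the unit names out of systemctl's listing. This is a minimal sketch, assuming the input comes from systemctl list-units --all --plain --no-legend (unit name in the first column); the function name leftover_agent_units is invented for this example.

```shell
# Hypothetical helper: prints unit names starting with "check_mk" or "check-mk".
# Expects `systemctl list-units --all --plain --no-legend` output on stdin.
leftover_agent_units() {
    awk '$1 ~ /^check[_-]mk/ { print $1 }'
}

# Usage:
#   systemctl list-units --all --plain --no-legend | leftover_agent_units
```

Note that cmk-agent-ctl-daemon.service matches neither pattern, so it may be worth checking that unit separately.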

Then install the agent and run the post-install scripts manually. You can decide for yourself whether you want to use the new encrypted agent controller or the old-style xinetd agent.

Installation without postinst:

dpkg --unpack <package>check-mk-agent_2.1.0p15.deb
rm -f /var/lib/dpkg/info/check-mk-agent.postinst
dpkg --configure check-mk-agent

Then if you want to configure for the new agent controller:

bash /var/lib/cmk-agent/scripts/cmk-agent-useradd.sh
bash /var/lib/cmk-agent/scripts/super-server/0_systemd/setup

If you want to use xinetd:

bash /var/lib/cmk-agent/scripts/super-server/1_xinetd/setup

If no errors are encountered, confirm with ss -tulpn | grep 6556 that the agent is running.

Hi Mattias,

Thank you for the detailed instructions.

I followed the process for manually configuring the new agent controller but, even though I did not encounter any errors, the agent is still not listening on any TCP ports.

systemctl status cmk-agent-ctl-daemon.service shows that the service is active and running. The only log output is Started Checkmk agent controller daemon. Is there somewhere else we can look to get more detailed logs that might explain why the agent controller is failing to bind to TCP port 6556?
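
For reference, one generic place to look is the systemd journal for all three agent units at once, since systemctl status only shows a short excerpt per unit. This is a sketch that assumes the unit names from the status output earlier in the thread; the wrapper name agent_journal is hypothetical, and there is no guarantee that the controller logs anything beyond its start message.

```shell
# Hypothetical wrapper: shows interleaved journal entries for the three
# agent units. Extra arguments are passed through to journalctl.
agent_journal() {
    journalctl --no-pager \
        -u check-mk-agent.socket \
        -u check-mk-agent-async.service \
        -u cmk-agent-ctl-daemon.service \
        "$@"
}

# Usage (reading the system journal may require elevated privileges):
#   agent_journal -b                     # entries since the current boot
#   agent_journal --since "1 hour ago"   # recent entries only
```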

Thanks in advance,
Jason

I’ll dig deeper during my working hours on Monday. I will probably have to set up an Ubuntu 18.04 test install. Just one question: Was this Ubuntu upgraded from a previous LTS like 16.04, or even in two steps from 14.04?

Hi Mattias,

The server has always been Ubuntu 18.04.

In case it is relevant, I believe the original Checkmk agent install (v1.2.8p16) came from the Ubuntu repositories. The upgrade version is vanilla 2.1.0 and was downloaded from our Checkmk server.

Thanks again for your help,
Jason

Just one thing to check: Does the command

cmk-agent-ctl --version -vv

actually print the version string or does it fail to start?

Hi Mattias,

The command seems to complete successfully:

jasons@agentServer:~$ sudo cmk-agent-ctl --version -vv
cmk-agent-ctl 2.1.0
jasons@agentServer:~$

Additionally, the expected services seem to be running:

jasons@agentServer:~$ ps -aux | grep -E 'c(heck([_-])?)?mk'
root      24030  0.0  0.0  14376  4280 ?        Ss   Nov04   0:18 /bin/bash /usr/bin/check_mk_agent
cmk-age+  24110  0.0  0.0  14824 10384 ?        Ssl  Nov04   0:00 /usr/bin/cmk-agent-ctl daemon
jasons@agentServer:~$

The above ps command returns the same results on both a working host and the problem host.

Regards,
Jason

I have the same issue on 18.04 after the same troubleshooting steps: nothing is listening on 6556. I reverted to the 2.0.0 agent, which works.

I have the same issue, except that in my case the port is open on IPv6 only.
Try editing /etc/systemd/system/check_mk.socket and changing =6556 to =0.0.0.0:6556,

or run this:
sed -i 's/=6556/=0.0.0.0:6556/g' /etc/systemd/system/check_mk.socket; systemctl daemon-reload; systemctl restart check_mk.socket

This seems different than the issue I am having; in my case the service is not listening on any TCP port, neither IPv4 nor IPv6.

Also, in my case the file /etc/systemd/system/check_mk.socket does not exist. I checked a different VM where the agent is working properly and found that the file does not exist there either, so its absence is unlikely to be the cause of the issue I am having.

One difference I did note between the machines, though, is the number of Accepted connections shown under the check-mk-agent.socket. On the working agent, the number is roughly equal to the number of minutes since the service was started; on the non-working agent, the number is 0:

jasons@brokenHost:~$ systemctl status check-mk-agent.socket
● check-mk-agent.socket - Local Checkmk agent socket
   Loaded: loaded (/lib/systemd/system/check-mk-agent.socket; enabled; vendor preset: enabled)
   Active: active (listening) since Fri 2022-11-04 09:45:31 CDT; 5 days ago
   Listen: /run/check-mk-agent.socket (Stream)
 Accepted: 0; Connected: 0
    Tasks: 0 (limit: 19140)
   CGroup: /system.slice/check-mk-agent.socket

Warning: Journal has been rotated since unit was started. Log output is incomplete or unavailable.

jasons@workingHost:~$ systemctl status check-mk-agent.socket
● check-mk-agent.socket - Local Checkmk agent socket
   Loaded: loaded (/lib/systemd/system/check-mk-agent.socket; enabled; vendor preset: enabled)
   Active: active (listening) since Wed 2022-11-02 14:45:52 CDT; 6 days ago
   Listen: /run/check-mk-agent.socket (Stream)
 Accepted: 10059; Connected: 0
    Tasks: 0 (limit: 19140)
   CGroup: /system.slice/check-mk-agent.socket

Warning: Journal has been rotated since unit was started. Log output is incomplete or unavailable.

I suspect the lack of connections is related to the lack of TCP binding, but I don’t know if it is cause or effect.
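
To watch that counter without reading the whole status output, the Accepted value can be extracted and sampled a few minutes apart; on the working host above it grows by roughly one per minute. This is a sketch that assumes the status layout shown above; accepted_count is a hypothetical helper name.

```shell
# Hypothetical helper: reads `systemctl status <unit>.socket` output on stdin
# and prints the number from the "Accepted: N; Connected: M" line.
accepted_count() {
    sed -n 's/^[[:space:]]*Accepted:[[:space:]]*\([0-9][0-9]*\);.*/\1/p'
}

# Usage:
#   systemctl status check-mk-agent.socket | accepted_count
```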

Any additional insights are appreciated.

Thanks,
Jason

I have been able to reproduce this issue with the following process:

  1. Deploy a new VM running Ubuntu 18.04.

  2. Install check-mk-agent and xinetd from the repositories.
    sudo apt install check-mk-agent xinetd

  3. Upgrade to v2.1.0.
    sudo apt install /tmp/check-mk-agent_2.1.0-1_all.deb

  4. Remove check-mk-agent.
    sudo apt purge check-mk-agent
    sudo rm -r /var/lib/cmk-agent /var/lib/check_mk_agent

  5. Remove xinetd.
    sudo apt remove xinetd --autoremove

  6. Re-install check-mk-agent.
    sudo apt install /tmp/check-mk-agent_2.1.0-1_all.deb

It seems likely that there is some leftover xinetd-related configuration, but I’m at a loss as to where I should be looking for it.

Any advice is appreciated.

Thanks,
Jason

EDIT: It turns out that the 1.x agent does not even need to be enabled to reproduce the issue, so I dropped a couple of unnecessary steps.

EDIT2: Turns out that I am able to reproduce the issue without upgrading and without xinetd. The following process seems to successfully recreate the issue:

  1. Deploy new Ubuntu 18.04 server.
  2. Install Checkmk agent v2.1.0
  3. Purge Checkmk agent.
  4. Delete non-empty directories /var/lib/cmk-agent and /var/lib/check_mk_agent.
  5. Reinstall Checkmk agent.
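
The steps above can be sketched as a script for a disposable VM. It assumes the package file is the /tmp/check-mk-agent_2.1.0-1_all.deb used earlier in the thread; the -y flags and rm -rf are additions for unattended use, and the function name is made up. It purges the agent and deletes its state directories, so it should not be run on a host that matters.

```shell
# Hypothetical reproduction helper for a throwaway Ubuntu 18.04 VM.
# $1: path to the agent package, e.g. /tmp/check-mk-agent_2.1.0-1_all.deb
reproduce_missing_listener() {
    deb="$1"
    sudo apt install -y "$deb" &&                                  # step 2: first install
    sudo apt purge -y check-mk-agent &&                            # step 3: purge
    sudo rm -rf /var/lib/cmk-agent /var/lib/check_mk_agent &&      # step 4: delete state
    sudo apt install -y "$deb"                                     # step 5: reinstall
}

# Usage, followed by the port check from earlier in the thread:
#   reproduce_missing_listener /tmp/check-mk-agent_2.1.0-1_all.deb
#   sudo ss -tulpn | grep 6556
```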

I will update the thread title to more accurately reflect the situation.

I also updated the monitoring server from vanilla 2.1.0 to 2.1.0p16 and the issue persists.

The minimal case to reproduce gave me an idea, so I compared the contents of /var/lib/cmk-agent and /var/lib/check_mk_agent between a working host and the non-working one and I found that the following 2 files were missing from the broken host:

  • /var/lib/cmk-agent/allow-legacy-pull
  • /var/lib/cmk-agent/registered_connections.json

I stopped all Checkmk services, manually created those files with the same contents and permissions as the working service and restarted the Checkmk agent.

After this, the agent is now listening on port 6556 as expected.
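
The comparison that surfaced the two files can be repeated on other hosts with a plain file listing from each side. This is a sketch that assumes each listing was produced with something like find /var/lib/cmk-agent /var/lib/check_mk_agent on the respective host and saved to a file; missing_on_broken is a hypothetical helper name.

```shell
# Hypothetical helper: prints paths listed for the working host that do not
# appear in the listing from the broken host.
# $1: file with one path per line from the working host
# $2: file with one path per line from the broken host
missing_on_broken() {
    # -F fixed strings, -x whole-line match, -v invert, -f read patterns from file
    grep -Fxv -f "$2" "$1"
}

# Usage:
#   missing_on_broken working.list broken.list
```

grep exits with status 1 when nothing is missing, which can be used in scripts to tell the two cases apart.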

So the issue seems to be that the Checkmk DEB package only creates those files the first time it is installed. This should be classified as a bug, in my opinion, but I don’t know where to go about raising it.

Thanks,
Jason

Hello,
you can send an email to the Checkmk Feedback System:
feedback@checkmk.com
Ralf

Thank you, Ralf.

I submitted a description of the issue to the email address. Hopefully the Checkmk team will address the issue to save future users some trouble.

Regards,
Jason

I’m having exactly the same problem: installation OK, no listener. A previous version of cmk-agent was installed. I cannot rule out that I accidentally installed the new version over it without uninstalling the previous one.

But as a matter of fact: the service is running but not listening on 6556.

Things are working now; the service is listening after a restart. But only after I registered the host with the server…

Can confirm the bug and the workaround mentioned by @jsmyth in version 2.1.0p25 and 2.2.0b2.