Trouble after upgrading to 2.1 agent

After upgrading to the new 2.1 version and registering the agent as described in “Monitoring Linux - The new agent for Linux in detail”, I get the following error: [agent] Host is registered for TLS but not using it

Am I missing something?

CMK version:
2.1
OS version:
ubuntu 20.04
Error message:
[agent] Host is registered for TLS but not using it CRIT
Output of “cmk --debug -vvn hostname”: (If it is a problem with checks or plugins)

Checkmk version 2.1.0
Try license usage history update.
Trying to acquire lock on /omd/sites/timdebruijn/var/check_mk/license_usage/next_run
Got lock on /omd/sites/timdebruijn/var/check_mk/license_usage/next_run
Trying to acquire lock on /omd/sites/timdebruijn/var/check_mk/license_usage/history.json
Got lock on /omd/sites/timdebruijn/var/check_mk/license_usage/history.json
Next run time has not been reached yet. Abort.
Releasing lock on /omd/sites/timdebruijn/var/check_mk/license_usage/history.json
Released lock on /omd/sites/timdebruijn/var/check_mk/license_usage/history.json
Releasing lock on /omd/sites/timdebruijn/var/check_mk/license_usage/next_run
Released lock on /omd/sites/timdebruijn/var/check_mk/license_usage/next_run
+ FETCHING DATA
  Source: SourceType.HOST/FetcherType.TCP
[cpu_tracking] Start [7f4e226d3fa0]
[TCPFetcher] Fetch with cache settings: DefaultAgentFileCache(rss.timdebruijn.lan, base_path=/omd/sites/timdebruijn/tmp/check_mk/cache, max_age=MaxAge(checking=0, discovery=120, inventory=120), disabled=False, use_outdated=False, simulation=False)
Not using cache (Does not exist)
[TCPFetcher] Execute data source
Connecting via TCP to 192.168.1.15:6556 (5.0s timeout)
Detected transport protocol: TransportProtocol.PLAIN (b'<<')
Closing TCP connection to 192.168.1.15:6556
[cpu_tracking] Stop [7f4e226d3fa0 - Snapshot(process=posix.times_result(user=0.0, system=0.0, children_user=0.0, children_system=0.0, elapsed=0.08999999985098839))]
  Source: SourceType.HOST/FetcherType.PIGGYBACK
[cpu_tracking] Start [7f4e226cd4c0]
[PiggybackFetcher] Fetch with cache settings: NoCache(rss.timdebruijn.lan, base_path=/omd/sites/timdebruijn/tmp/check_mk/data_source_cache/piggyback, max_age=MaxAge(checking=0, discovery=120, inventory=120), disabled=True, use_outdated=False, simulation=False)
Not using cache (Cache usage disabled)
[PiggybackFetcher] Execute data source
Piggyback file '/omd/sites/timdebruijn/tmp/check_mk/piggyback/rss.timdebruijn.lan/vcsa.timdebruijn.nl': Successfully processed from source 'vcsa.timdebruijn.nl'
No piggyback files for '192.168.1.15'. Skip processing.
Not using cache (Cache usage disabled)
[cpu_tracking] Stop [7f4e226cd4c0 - Snapshot(process=posix.times_result(user=0.0, system=0.0, children_user=0.0, children_system=0.0, elapsed=0.0))]
+ PARSE FETCHER RESULTS
  Source: SourceType.HOST/FetcherType.TCP
  -> Not adding sections: Host is registered for TLS but not using it
  Source: SourceType.HOST/FetcherType.PIGGYBACK
<<<esx_vsphere_vm:cached(1653924216,90)>>> / Transition NOOPParser -> HostSectionParser
<<<labels:sep(0)>>> / Transition HostSectionParser -> HostSectionParser
No persisted sections
  -> Add sections: ['esx_vsphere_vm', 'labels']
Received no piggyback data
Received no piggyback data
[cpu_tracking] Start [7f4e226cda90]
value store: synchronizing
Trying to acquire lock on /omd/sites/timdebruijn/tmp/check_mk/counters/rss.timdebruijn.lan
Got lock on /omd/sites/timdebruijn/tmp/check_mk/counters/rss.timdebruijn.lan
value store: loading from disk
Releasing lock on /omd/sites/timdebruijn/tmp/check_mk/counters/rss.timdebruijn.lan
Released lock on /omd/sites/timdebruijn/tmp/check_mk/counters/rss.timdebruijn.lan
CPU load             PEND - Check plugin received no monitoring data
CPU utilization      PEND - Check plugin received no monitoring data
Check_MK Agent       PEND - Check plugin received no monitoring data
Disk IO SUMMARY      PEND - Check plugin received no monitoring data
ESX CPU              demand is 0.147 Ghz, 1 virtual CPUs
ESX Datastores       Stored on nvme1-esxi2 (931.25 GB/91.7% free)
ESX Guest Tools      VMware Tools are installed, but are not managed by VMWare
ESX Heartbeat        Heartbeat status is green
ESX Hostsystem       Running on 192.168.1.32
ESX Memory           Host: 1.03 GB, Guest: 327.00 MB, Ballooned: 0.00 B, Private: 1.00 GB, Shared: 0.00 B
ESX Mounted Devices  HA functionality guaranteed
ESX Name             rss.timdebruijn.lan
ESX Snapshots        Count: 0
Filesystem /         PEND - Check plugin received no monitoring data
Filesystem /boot     PEND - Check plugin received no monitoring data
Interface 2          PEND - Check plugin received no monitoring data
Interface 3          PEND - Check plugin received no monitoring data
Kernel Performance   PEND - Check plugin received no monitoring data
Memory               PEND - Check plugin received no monitoring data
Mount options of /   PEND - Check plugin received no monitoring data
Mount options of /boot PEND - Check plugin received no monitoring data
Number of threads    PEND - Check plugin received no monitoring data
PostgreSQL ANALYZE MAIN/miniflux PEND - Check plugin received no monitoring data
PostgreSQL ANALYZE MAIN/postgres PEND - Check plugin received no monitoring data
PostgreSQL Bloat MAIN/miniflux PEND - Check plugin received no monitoring data
PostgreSQL Bloat MAIN/postgres PEND - Check plugin received no monitoring data
PostgreSQL Connection Time MAIN PEND - Check plugin received no monitoring data
PostgreSQL Connections MAIN/miniflux PEND - Check plugin received no monitoring data
PostgreSQL Connections MAIN/postgres PEND - Check plugin received no monitoring data
PostgreSQL DB MAIN/miniflux Size PEND - Check plugin received no monitoring data
PostgreSQL DB MAIN/miniflux Statistics PEND - Check plugin received no monitoring data
PostgreSQL DB MAIN/postgres Size PEND - Check plugin received no monitoring data
PostgreSQL DB MAIN/postgres Statistics PEND - Check plugin received no monitoring data
PostgreSQL DB MAIN/template1 Size PEND - Check plugin received no monitoring data
PostgreSQL DB MAIN/template1 Statistics PEND - Check plugin received no monitoring data
PostgreSQL Daemon Sessions MAIN PEND - Check plugin received no monitoring data
PostgreSQL Instance MAIN PEND - Check plugin received no monitoring data
PostgreSQL Locks MAIN/miniflux PEND - Check plugin received no monitoring data
PostgreSQL Locks MAIN/postgres PEND - Check plugin received no monitoring data
PostgreSQL Query Duration MAIN/miniflux PEND - Check plugin received no monitoring data
PostgreSQL Query Duration MAIN/postgres PEND - Check plugin received no monitoring data
PostgreSQL VACUUM MAIN/miniflux PEND - Check plugin received no monitoring data
PostgreSQL VACUUM MAIN/postgres PEND - Check plugin received no monitoring data
PostgreSQL Version main PEND - Check plugin received no monitoring data
Systemd Service Summary PEND - Check plugin received no monitoring data
Systemd Timesyncd Time PEND - Check plugin received no monitoring data
TCP Connections      PEND - Check plugin received no monitoring data
Uptime               PEND - Check plugin received no monitoring data
WireGuard wg0        PEND - Check plugin received no monitoring data
WireGuard wg0 Peer pdLb1t5BLDupVDIk3/DKzOmYYf6Ak3ioz9tBFWBT1G8 PEND - Check plugin received no monitoring data
Piggyback file '/omd/sites/timdebruijn/tmp/check_mk/piggyback/rss.timdebruijn.lan/vcsa.timdebruijn.nl': Successfully processed from source 'vcsa.timdebruijn.nl'
No piggyback files for '192.168.1.15'. Skip processing.
[cpu_tracking] Stop [7f4e226cda90 - Snapshot(process=posix.times_result(user=0.010000000000000009, system=0.0, children_user=0.0, children_system=0.0, elapsed=0.010000001639127731))]
[agent] Host is registered for TLS but not using it(!!), [piggyback] Successfully processed from source 'vcsa.timdebruijn.nl', Missing monitoring data for plugins: checkmk_agent, cpu_loads, cpu_threads, df, diskstat, kernel_performance, kernel_util, lnx_if, mem_linux, mounts, postgres_bloat, postgres_conn_time, postgres_connections, postgres_instances, postgres_locks, postgres_query_duration, postgres_sessions, postgres_stat_database, postgres_stat_database_size, postgres_stats, postgres_version, systemd_units_services_summary, tcp_conn_stats, timesyncd, uptime, wireguard(!), execution time 0.1 sec | execution_time=0.100 user_time=0.010 system_time=0.000 children_user_time=0.000 children_system_time=0.000 cmk_time_agent=0.090

Was this already performed? Monitoring Linux - The new agent for Linux in detail

Yes, I did use the cmk-agent-ctl register command; after that I got this message.

Could you please check who is claiming port 6556?

ss -tulpn | grep 6556

This should be cmk-agent-ctl in daemon mode. If it is xinetd, remove the xinetd config file for the Checkmk agent and reinstall the agent package. If it is systemd, please give us the output of

systemctl --version
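
The check above can be scripted. This is only a rough sketch: port_owner and advise are hypothetical helper names, not part of Checkmk, and the parsing assumes the `ss -tulpn` column layout shown in this thread (local address in the fifth column, process in a users:(("name",...)) field).

```shell
#!/bin/sh
# Hypothetical helper (not part of Checkmk): read `ss -tulpn` output on stdin
# and print the name of each process listening on the given port (default 6556).
port_owner() {
    port="${1:-6556}"
    # Keep lines whose local address ends in ":<port>", then extract the
    # process name from the users:(("name",pid=...)) field.
    awk -v p=":$port" '$5 ~ (p "$") { print }' |
        sed -n 's/.*users:(("\([^"]*\)".*/\1/p'
}

# Hypothetical helper: map the owner to the advice given in this post.
advise() {
    case "$1" in
        cmk-agent-ctl) echo "ok: the agent controller owns the port" ;;
        xinetd)        echo "xinetd: remove its Checkmk agent config, then reinstall the agent package" ;;
        systemd)       echo "systemd: a socket unit owns the port, post the output of systemctl --version" ;;
        "")            echo "nothing is listening on the port" ;;
        *)             echo "unexpected owner: $1" ;;
    esac
}
```

Run as root (otherwise ss does not show process names): advise "$(ss -tulpn | port_owner 6556)"
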
root@rss:/home/admin# ss -tulpn | grep 6556
tcp    LISTEN  0       4096                       *:6556                *:*      users:(("systemd",pid=1,fd=60))
root@rss:/home/admin# systemctl --version
systemd 245 (245.4-4ubuntu3.17)
+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN2 -IDN +PCRE2 default-hierarchy=hybrid

This looks like a leftover service from 2.0. Could you please stop the service:

systemctl stop check-mk-agent.socket

Then confirm it is stopped:

ss -tulpn | grep 6556 # should be empty

Then reinstall the agent package and check whether the cmk-agent-ctl does now claim 6556. If this does not help we’ll dig through the post install script tomorrow.

Hi mschlenker,

Thank you for your suggestions. Unfortunately, it did not work:

root@rss:/home/admin# systemctl status check-mk-agent.socket
● check-mk-agent.socket - Local Checkmk agent socket
     Loaded: loaded (/lib/systemd/system/check-mk-agent.socket; enabled; vendor preset: enabled)
     Active: active (listening) since Mon 2022-05-30 18:46:54 UTC; 12min ago
   Triggers: ● check-mk-agent@0.service
     Listen: /run/check-mk-agent.socket (Stream)
   Accepted: 0; Connected: 0;
      Tasks: 0 (limit: 1066)
     Memory: 52.0K
     CGroup: /system.slice/check-mk-agent.socket

May 30 18:46:54 rss systemd[1]: Starting Local Checkmk agent socket.
May 30 18:46:54 rss systemd[1]: Listening on Local Checkmk agent socket.
root@rss:/home/admin# systemctl stop check-mk-agent.socket
root@rss:/home/admin# systemctl disable check-mk-agent.socket
Removed /etc/systemd/system/sockets.target.wants/check-mk-agent.socket.
root@rss:/home/admin# ls
check-mk-agent_2.1.0-1_all.deb  index.html
root@rss:/home/admin# dpkg -i check-mk-agent_2.1.0-1_all.deb
(Reading database ... 110354 files and directories currently installed.)
Preparing to unpack check-mk-agent_2.1.0-1_all.deb ...
Removing systemd units: check-mk-agent@.service, check-mk-agent-async.service, cmk-agent-ctl-daemon.service, check-mk-agent.socket
Unpacking check-mk-agent (2.1.0-1) over (2.1.0-1) ...
Setting up check-mk-agent (2.1.0-1) ...

Deploying systemd units: check-mk-agent@.service check-mk-agent-async.service cmk-agent-ctl-daemon.service check-mk-agent.socket
Deployed systemd
Creating/updating cmk-agent user account ...
Activating systemd unit 'check-mk-agent-async.service'...
Activating systemd unit 'cmk-agent-ctl-daemon.service'...
Activating systemd unit 'check-mk-agent.socket'...
Created symlink /etc/systemd/system/sockets.target.wants/check-mk-agent.socket → /lib/systemd/system/check-mk-agent.socket.
root@rss:/home/admin# ss -tulpn | grep 6556
tcp    LISTEN  0       4096                       *:6556                *:*      users:(("systemd",pid=1,fd=86))

Looking at the post install script, there might be confusion regarding

  • /usr/lib/systemd/system
  • /lib/systemd/system

On Ubuntu 20.04, one should be a symlink to the other. However, this is not guaranteed after an upgrade from older Ubuntu versions. If the two directories are different and both contain config files with cmk or check-mk in their names: first uninstall, then remove the remaining config files, then reinstall.
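
A quick way to check this might look like the following. It is a sketch only: compare_unit_dirs is a hypothetical helper name, and the file name patterns are simply the two substrings mentioned above.

```shell
#!/bin/sh
# Hypothetical helper: report whether two unit directories resolve to the same
# directory, then list files with cmk or check-mk in their names in each.
compare_unit_dirs() {
    a="${1:-/usr/lib/systemd/system}"
    b="${2:-/lib/systemd/system}"
    # cd -P follows symlinks, so two paths to one directory print the same name.
    real_a="$(cd -P -- "$a" 2>/dev/null && pwd -P)"
    real_b="$(cd -P -- "$b" 2>/dev/null && pwd -P)"
    if [ -n "$real_a" ] && [ "$real_a" = "$real_b" ]; then
        echo "same directory"
    else
        echo "different directories"
    fi
    for d in "$a" "$b"; do
        for f in "$d"/*cmk* "$d"/*check-mk*; do
            # An unmatched glob stays literal, so only print paths that exist.
            if [ -e "$f" ]; then echo "$f"; fi
        done
    done
}
```

Called without arguments, it compares /usr/lib/systemd/system with /lib/systemd/system. If it reports "different directories" and lists files under both, that matches the situation described above.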

This is a fresh install of 20.04, never done a dist-upgrade…

So I did an apt purge check-mk-agent and reinstalled the agent; now it works:

root@rss:/home/admin# ss -tulpn | grep 6556
tcp    LISTEN  0       4096                       *:6556                *:*      users:(("cmk-agent-ctl",pid=21196,fd=9))

To help us track this down: are you using CRE or CEE? If CEE: vanilla agent or baked agent?

I’m using the CRE version.

I too had problems after upgrading to the 2.1 agent (Checkmk Raw Edition 2.1.0):
Systemd Service Summary Total: 153, Disabled: 2, Failed: 2, 2 services failed (cmk-agent-ctl-daemon, systemd-networkd-wait-online) CRIT
TLS was enabled but not being used. Registration was OK, but port 6556 was in use:

root@ssfdb1:/tmp#  ss -tulpn | grep 6556
tcp     LISTEN   0        4096                   *:6556                 *:*      users:(("systemd",pid=1,fd=82))
root@ssfdb1:/tmp# systemctl start cmk-agent-ctl-daemon.service
root@ssfdb1:/tmp# tail -f /var/log/syslog
May 31 05:04:05 hostname_redacted systemd[1]: Stopped Checkmk agent controller daemon.
May 31 05:04:05 hostname_redacted systemd[1]: Started Checkmk agent controller daemon.
May 31 05:04:06 hostname_redacted cmk-agent-ctl[22636]: ERROR [cmk_agent_ctl] Address in use (os error 98)

Fixed by

apt purge check-mk-agent
apt install ./check-mk-agent_2.1.0-1_all.deb
root@ssfdb1:/tmp# ss -tulpn | grep 6556
tcp     LISTEN   0        4096                   *:6556                 *:*      users:(("cmk-agent-ctl",pid=27176,fd=9))
root@ssfdb1:/tmp#

Phew
That was painful
Thanks Timdebruijn
Julian

Hello everyone, I got a similar issue after an upgrade from 2.0.0p24 to 2.1.0p2 Raw Edition.

"
Jun 17 10:57:15 nc systemd[1]: cmk-agent-ctl-daemon.service: Scheduled restart job, restart counter is at 2.
Jun 17 10:57:15 nc systemd[1]: Stopped Checkmk agent controller daemon.
Jun 17 10:57:15 nc systemd[1]: cmk-agent-ctl-daemon.service: Start request repeated too quickly.
Jun 17 10:57:15 nc systemd[1]: cmk-agent-ctl-daemon.service: Failed with result 'exit-code'.
Jun 17 10:57:15 nc systemd[1]: Failed to start Checkmk agent controller daemon.
"

I tried adjusting the restart timing in /etc/systemd/system/multi-user.target.wants/cmk-agent-ctl-daemon.service with StartLimitInterval=30, but nothing changed.

Is it OK to post here, or should I open a new thread?

2.1.0p4 should fix this issue.

Nope, it doesn’t. Updating check-mk-agent:all 2.0.0p20-1 -> 2.1.0p4-1 still leaves cmk-agent-ctl-daemon.service as failed on Debian 11 with Raw.

/var/log/dpkg.log
2022-07-01 09:15:23 startup archives install
2022-07-01 09:15:23 upgrade check-mk-agent:all 2.0.0p20-1 2.1.0p4-1
2022-07-01 09:15:23 status half-configured check-mk-agent:all 2.0.0p20-1
2022-07-01 09:15:23 status unpacked check-mk-agent:all 2.0.0p20-1
2022-07-01 09:15:23 status half-installed check-mk-agent:all 2.0.0p20-1
2022-07-01 09:15:24 status unpacked check-mk-agent:all 2.1.0p4-1
2022-07-01 09:15:24 configure check-mk-agent:all 2.1.0p4-1 2.1.0p4-1
2022-07-01 09:15:24 status half-configured check-mk-agent:all 2.1.0p4-1
2022-07-01 09:15:25 status installed check-mk-agent:all 2.1.0p4-1
/var/log/daemon.log
Jul  1 09:13:32 server systemd[1]: Started Checkmk agent (10.99.0.15:52748).
Jul  1 09:13:33 server systemd[1]: check_mk@24081-10.99.0.50:6556-10.99.0.15:52748.service: Succeeded.
Jul  1 09:14:33 server systemd[1]: Started Checkmk agent (10.99.0.15:52946).
Jul  1 09:14:33 server systemd[1]: check_mk@24082-10.99.0.50:6556-10.99.0.15:52946.service: Succeeded.
Jul  1 09:15:24 server systemd[1]: Reloading.
Jul  1 09:15:24 server systemd[1]: Reloading.
Jul  1 09:15:25 server systemd[1]: Started Checkmk agent - Asynchronous background tasks.
Jul  1 09:15:25 server systemd[1]: Reloading.
Jul  1 09:15:25 server systemd[1]: Starting Local Checkmk agent socket.
Jul  1 09:15:25 server systemd[1]: Listening on Local Checkmk agent socket.
Jul  1 09:15:25 server systemd[1]: Reloading.
Jul  1 09:15:25 server systemd[1]: Started Checkmk agent controller daemon.
Jul  1 09:15:25 server cmk-agent-ctl[3069433]: ERROR [cmk_agent_ctl] Failed to listen on TCP socket for incoming pull connections.
Jul  1 09:15:25 server cmk-agent-ctl[3069433]: Error with IPV6:
Jul  1 09:15:25 server cmk-agent-ctl[3069433]: Address in use (os error 98)
Jul  1 09:15:25 server cmk-agent-ctl[3069433]: Error with IPV4:
Jul  1 09:15:25 server cmk-agent-ctl[3069433]: Address in use (os error 98)
Jul  1 09:15:25 server systemd[1]: cmk-agent-ctl-daemon.service: Main process exited, code=exited, status=1/FAILURE
Jul  1 09:15:25 server systemd[1]: cmk-agent-ctl-daemon.service: Failed with result 'exit-code'.

It looks like check_mk.socket is still holding port 6556; it has been up for two weeks.

root@server:~# systemctl status check_mk.socket
* check_mk.socket - Check_MK Agent Socket
     Loaded: loaded (/etc/systemd/system/check_mk.socket; enabled; vendor preset: enabled)
     Active: active (listening) since Tue 2022-06-14 14:30:57 CEST; 2 weeks 2 days ago
     Listen: [::]:6556 (Stream)
   Accepted: 24094; Connected: 0;
      Tasks: 0 (limit: 231656)
     Memory: 0B
        CPU: 0
     CGroup: /system.slice/check_mk.socket

Jun 14 14:30:57 server systemd[1]: Listening on Check_MK Agent Socket.
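
When ss only reports "systemd" (pid 1) as the owner of the port, the listing of socket units can name the unit behind it. A small sketch: socket_units_on_port is a hypothetical helper, and it assumes `systemctl list-sockets` prints the listen address in the first column and the unit name in the second, which may differ between systemd versions.

```shell
#!/bin/sh
# Hypothetical helper: read `systemctl list-sockets` output on stdin and print
# the socket units whose listen address ends in ":<port>" (default 6556).
socket_units_on_port() {
    port="${1:-6556}"
    awk -v p=":$port" '$1 ~ (p "$") { print $2 }'
}
```

Usage: systemctl list-sockets --all --no-legend --no-pager | socket_units_on_port 6556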

See my edit to the post above about check_mk.socket. I think it is the culprit.

Kernel: Linux server 5.15.35-1-pve #1 SMP PVE 5.15.35-3 (Wed, 11 May 2022 07:57:51 +0200) x86_64 GNU/Linux
IPv6 is not disabled:

root@server:~# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth0@if160: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 0a:10:2d:df:7e:a6 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.99.0.50/24 brd 10.99.0.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::810:2dff:fedf:7ea6/64 scope link 
       valid_lft forever preferred_lft forever

OK, stopping the socket, purging the old install and making sure 6556 is free before installing the 2.1 agent should do the trick.

systemctl stop check_mk.socket
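
Put together, the sequence could be sketched like this. It is not an official procedure: run_step and clean_reinstall are hypothetical names, the wrapper only prints the commands unless RUN=1 is set, and the port check assumes that ss prints the local address followed by whitespace.

```shell
#!/bin/sh
# Print a command instead of running it unless RUN=1 is set (dry run by default).
run_step() {
    if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "$*"; fi
}

# Hypothetical wrapper around the steps described above.
# $1 = path to the new agent .deb, remaining arguments = leftover socket units to stop.
clean_reinstall() {
    deb="$1"
    shift
    for unit in "$@"; do
        run_step systemctl stop "$unit"
    done
    run_step apt purge -y check-mk-agent
    # Only checked on a real run: refuse to install while port 6556 is taken.
    if [ "${RUN:-0}" = 1 ] && ss -tuln | grep -q ':6556 '; then
        echo "port 6556 is still in use, not installing" >&2
        return 1
    fi
    run_step apt install -y "$deb"
}
```

For example, clean_reinstall ./check-mk-agent_2.1.0-1_all.deb check_mk.socket prints the three commands in order. With RUN=1 and root privileges it executes them instead.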

I know. :)

But wouldn’t it be nice if the check-mk-agent package did that? That is kind of the purpose of .deb files, isn’t it? Updating an agent shouldn’t be this difficult. Also

2.1.0p4 should fix this issue.

It clearly doesn’t. Should I report a bug or something?

Tell me whether it was the vanilla agent included in all editions or an agent built with the bakery (post install and pre remove differ here) and I’ll file a bug internally.

As mentioned before, Raw Edition, so no bakery. The old and new .deb files were downloaded directly (wget) from my Checkmk installation: Setup → Agents → Linux → Packaged Agents

and I’ll file a bug internally.

thanks!