Enabled Automatic Updates, Failures Everywhere

CMK version: Checkmk Free Edition 2.1.0p20
OS version: Debian 11

Error message:
Various, from multiple sources:
Agent controller not registered
[agent] Error establishing TLS connection
[piggyback] Source ‘’ not sending piggyback data, Got no information from host

No response when registering any agents:

Output of “cmk --debug -vvn hostname”: (If it is a problem with checks or plugins)

Checkmk version 2.1.0p20
Try license usage history update.
Trying to acquire lock on /omd/sites/plainview/var/check_mk/license_usage/next_run
Got lock on /omd/sites/plainview/var/check_mk/license_usage/next_run
Trying to acquire lock on /omd/sites/plainview/var/check_mk/license_usage/history.json
Got lock on /omd/sites/plainview/var/check_mk/license_usage/history.json
Next run time has not been reached yet. Abort.
Releasing lock on /omd/sites/plainview/var/check_mk/license_usage/history.json
Released lock on /omd/sites/plainview/var/check_mk/license_usage/history.json
Releasing lock on /omd/sites/plainview/var/check_mk/license_usage/next_run
Released lock on /omd/sites/plainview/var/check_mk/license_usage/next_run
Failed to lookup IPv4 address of hostname via DNS: [Errno -2] Name or service not known

I am trying out Checkmk for the first time. Installed two days ago, monitoring a dozen hosts including a single VMware ESXi host. Everything working beautifully.

Followed along with: Episode 32: Working with the Agent bakery in Checkmk

I attempted to re-register the very first agent, on the Checkmk box itself, and noticed I did not receive a “Successfully registered agent” message. Nor did I receive it on any other host after that.

Example Windows machine:
Check_MK service: [agent] Agent controller not registered CRIT, execution time 0.0 sec

C:\Program Files (x86)\checkmk\service>cmk-agent-ctl status
Version: 2.1.0p20
Agent socket: operational
IP allowlist: any


Connection: checkmk.<internal domain>:8000/plainview
        UUID: 705119de-bf22-4e27-844a-44c5696d7dac
        Local:
                Connection type: pull-agent
                Certificate issuer: Site 'plainview' local CA
                Certificate validity: Sat, 11 Feb 2023 20:18:59 +0000 - Thu, 14 Jun 3021 20:18:59 +0000
        Remote:
                Connection type: pull-agent
                Registration state: operational
                Host name: checkmk.<internal domain>

C:\Program Files (x86)\checkmk\service>

Example Linux machine (running checkmk):
Check_MK service: [agent] Error establishing TLS connection

On Host:

root@checkmk:~# cmk-agent-ctl status
Version: 2.1.0p20
Agent socket: operational
IP allowlist: any


Connection: checkmk.<internal domain>:8000/plainview
        UUID: df55db24-5ad9-4c27-acdf-0006e52d0376
        Local:
                Connection type: pull-agent
                Certificate issuer: Site 'plainview' local CA
                Certificate validity: Sat, 11 Feb 2023 20:09:13 +0000 - Thu, 14 Jun 3021 20:09:13 +0000
        Remote:
                Error: Request failed with code 404 Not Found: Host is not registered (!!)

For the hosts whose errors state that the host is not registered, I attempted to delete them entirely within Checkmk, followed by cmk-agent-ctl delete-all on the host. They promptly return to being not registered.
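For anyone checking several hosts for this state, here is a minimal sketch that counts the "Host is not registered" entries in the output of cmk-agent-ctl status. The function name count_unregistered is made up for illustration. It only searches for the error text shown above and does not talk to Checkmk itself.

```shell
#!/bin/sh
# Hypothetical helper: reads `cmk-agent-ctl status` output on stdin and
# prints how many connections report "Host is not registered".
count_unregistered() {
    # grep -c prints the number of matching lines; it exits non-zero when
    # the count is 0, so "|| true" keeps the function's exit status clean.
    grep -c 'Host is not registered' || true
}

# Intended use on a monitored host (needs root, not executed here):
#   cmk-agent-ctl status | count_unregistered
```

A result of 0 only means the controller does not report the 404 error. It says nothing about whether the agent updater is registered.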

I followed the YouTube video very specifically on my first go, though I skipped the step regarding “Certificates for HTTPS verification” because I’d uploaded the CA certificate for the web host to Global settings > Trusted certificate authorities for SSL.

I have modified this rule now to include the same CA certificate that is listed in the Global settings as well. I then removed all of those hosts from Checkmk, uninstalled the agents, and reinstalled the agents. After the reinstall, no agents seem to listen.

On the Linux server hosting checkmk for example:

Communication failed: [Errno 111] Connection refused


root@checkmk:~# systemctl status cmk-agent-ctl-daemon.service
● cmk-agent-ctl-daemon.service - Checkmk agent controller daemon
     Loaded: loaded (/lib/systemd/system/cmk-agent-ctl-daemon.service; enabled; vendor preset: enabled)
     Active: active (running) since Sat 2023-02-11 17:22:44 EST; 6min ago
   Main PID: 603 (cmk-agent-ctl)
      Tasks: 3 (limit: 2938)
     Memory: 2.7M
        CPU: 6ms
     CGroup: /system.slice/cmk-agent-ctl-daemon.service
             └─603 /usr/bin/cmk-agent-ctl daemon

Feb 11 17:22:44 checkmk systemd[1]: Started Checkmk agent controller daemon.
root@checkmk:~# ss -atpun | grep 6556
root@checkmk:~#

You will not get a message when registering via the agent controller. It will just return to the prompt after you accept the cert.

The TLS registration is cached, so you won’t see the changes immediately; just have patience :)

If you get connection refused, you have introduced another issue. The SSL cert you have imported has nothing to do with the TLS cert. The TLS cert comes, as you perhaps saw in the status output, from an internal CA that is only used for the agent TLS communication.

The SSL cert is used by the Agent bakery and automatic updates, as well as when you register for automatic updates (which is not done using the agent controller).

The only case where the SSL cert might be used is for getting the TLS port (normally 8000) from your site: the agent controller asks the REST API which port is in use, and it will try both HTTP and HTTPS. If only HTTPS is available, you need to have your certs in the Agent…


You’re absolutely correct, patience is key! Thank you for explaining the TLS caching and for relieving my concern about not receiving a response after registering. A lot of the documentation I see online says it should return a message saying it was successful, and since the problem is that the host doesn’t appear to be registered, the silence seemed alarming.

Admittedly, this morning I just started over. It’s a brand new Linux machine anyway, so reloading isn’t a big deal. Otherwise I’d spend even more time diagnosing the software instead of using it!

I created a new site, configured HTTPS as suggested in Securing the web interface with HTTPS, then immediately followed the prerequisites in Automatic updates, which on my second go is remarkably straightforward.

This time I replaced the certificate on the client as well and set the check-in interval to 1 minute. In both cases I let the Checkmk interface retrieve the certificate directly from the web server as shown below. In addition, I’ve also added the CA certificate to the server’s trusted CA list.

So, at this point I believe I have a stock checkmk configuration aside from a new site called plainview, HTTPS configured on Apache, and Agent updater enabled.

I finally installed the new agent on the Checkmk instance (I had not previously installed the stock agent). This time I did indeed have patience and waited. Alas, I do not know how long I should wait for the TLS cache to be updated. It has been just over 2 hours with no change:

user@checkmk:~$ sudo cmk-agent-ctl register --hostname checkmk.<internal domain> --server checkmk.<internal domain> --site plainview --user cmkadmin
Attempting to register at checkmk.<internal domain>:8000/plainview. Server certificate details:

PEM-encoded certificate:
-----BEGIN CERTIFICATE-----
MIIC7jCCAdagAwIBAgIUE2ck9CDSDAl3C5dq9csnBj6NGqEwDQYJKoZIhvcNAQEL
BQAwJDEiMCAGA1UEAwwZU2l0ZSAncGxhaW52aWV3JyBsb2NhbCBDQTAgFw0yMzAy
MTIxNzE5NDRaGA8zMDIxMDYxNTE3MTk0NFowFDESMBAGA1UEAwwJcGxhaW52aWV3
MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAzcccmf7LyGYlrZrWJNuT
2+Vb/bz/rOOCC0q2rf425U8AcTR1BASkEPGf3gDG21w1gZBPVVsdhcmQxxk9buvE
46/d1ezp17xzNU5nKdOvTm7/ZDlktC7aC1szNOwEXK/6lRI/tcFW0FraoEixuPJ6
m12b//RxSi6fDu0Kn64S8NF///3N/bQPzMUmyE10oFlgzWQ3knMRr88uPLUthiVQ
xfb4NjOnwDA9lHmAwcNR08Y5Qz3VD0w8FKWcd4N6HjUkDdKBf+Glv6/DN+19+tQy
QJajKhAqCeUYj4/2CNEO9Z8+ME4gDQZ3PT9m9WBWr8ebNuUhw3r4qGKUMEUWiGQb
FQIDAQABoyYwJDAUBgNVHREEDTALgglwbGFpbnZpZXcwDAYDVR0TAQH/BAIwADAN
BgkqhkiG9w0BAQsFAAOCAQEAKjm3gv92VkdBOQ01XdjwvBZoJhFra/lXbIcAf8Yi
sjT2cLL6ACUU5MDxUouxBItDJfgqByZ2nkF5RR+CvXvC0T/liXjXz5PCvHkGQ62u
VFRCbgFatgFubbgvXawnAwOEGoyRPHIwD9B6XwPjjGRPzeexPpQY0UhWKYlqRWP3
eZ22XhZQ223GkIUhLdMujElspp7BqhxkYFeJ28+tTO1QXe8JUJxtBGwcY63vvWlV
M/Sv4F8BqaeIouDfVFpo2AAzuxfL+RNBSAr9uHow4Dl8I4Y9qaSAk+fLVDJ331mT
nlnUrxQiHtPkelBoAVpV70YnXDc7AmrR1w/it/JPgaJWAA==
-----END CERTIFICATE-----

Issued by:
        Site 'plainview' local CA
Issued to:
        plainview
Validity:
        From Sun, 12 Feb 2023 17:19:44 +0000
        To   Fri, 15 Jun 3021 17:19:44 +0000

Do you want to establish this connection? [Y/n]
> y

Please enter password for 'cmkadmin'
>
user@checkmk:~$

A few hours later:

user@checkmk:~$ sudo cmk-agent-ctl status
Version: 2.1.0p21
Agent socket: operational
IP allowlist: any


Connection: checkmk.<internal domain>:8000/plainview
        UUID: b454ad96-eacf-4c44-b654-fc2d3fcdfae5
        Local:
                Connection type: pull-agent
                Certificate issuer: Site 'plainview' local CA
                Certificate validity: Sun, 12 Feb 2023 18:04:40 +0000 - Fri, 15 Jun 3021 18:04:40 +0000
        Remote:
                Connection type: pull-agent
                Registration state: operational
                Host name: checkmk.<internal domain>

But at least the client is responding in the connectivity checks now (checkmk.<internal domain> resolves to 127.0.1.1 on the server, not elsewhere):

I am unsure of how to proceed here. Is perhaps using the default cmkadmin user for registering the agents a problem?

Ultimately, Checkmk was showing the agents as not registered because they were not in fact registered. I confused the register keyword in cmk-agent-ctl register with the one in cmk-update-agent register, or perhaps on some subconscious level assumed the first would handle both.
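For anyone landing here with the same confusion, this sketch prints the two registrations side by side without running them. The server and host names are placeholders, and the function name print_registration_commands is made up for illustration. The cmk-agent-ctl flags are the ones used earlier in this thread. The cmk-update-agent flags are written from memory of the agent updater's command line and should be treated as an assumption; check cmk-update-agent register --help on your version.

```shell
#!/bin/sh
# Placeholder values (assumptions); replace with your own.
SERVER="checkmk.example.internal"
SITE="plainview"
HOST="checkmk.example.internal"
USER_NAME="cmkadmin"

# Print, but do not run, the two separate registration commands.
print_registration_commands() {
    # 1. Agent controller: TLS registration for the monitoring connection.
    echo "cmk-agent-ctl register --hostname $HOST --server $SERVER --site $SITE --user $USER_NAME"
    # 2. Agent updater: registration for automatic updates from the bakery.
    #    Flags assumed from the updater's CLI; verify before use.
    echo "cmk-update-agent register -s $SERVER -i $SITE -H $HOST -p https -U $USER_NAME -v"
}

print_registration_commands
```

Both commands are run on the monitored host, and each covers only its own piece: the first handles the agent's TLS connection, the second the automatic updates.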

Thank you again Anders! My humility and I will go back into the shadows!
