Cmk-agent can no longer be registered at server

**CMK version:** 2.2.0p6.cee
**OS version:** Debian Bullseye

**Error message:** ERROR [cmk_agent_ctl] Error registering existing host at https://10.111.125.126:8001/mobilfunk

Caused by:
Request failed with code 403 Forbidden: Unauthorized - Details: Unauthorized to read the global settings

Hi,

For a few weeks we have been having problems registering a new host with our Checkmk instance. Other servers are already registered with the same instance. The issue first occurred with 2.2.0p5.cee; upgrading to p6 didn’t resolve it.

The user used for registration has admin privileges in Checkmk and can see the global settings in the web UI.

[root@services-devops.hardware:~#] cmk-agent-ctl register -s 10.111.***.*** -i mysite -H myhostname -U myuser
Attempting to register at 10.111.***.***, port 8001. Server certificate details:

PEM-encoded certificate:
-----BEGIN CERTIFICATE-----
*****
-----END CERTIFICATE-----

Issued by:
	Site 'mysite' local CA
Issued to:
	mysite
Validity:
	From Fri, 27 May 2022 13:04:53 +0000
	To   Wed, 27 Sep 3020 13:04:53 +0000

Do you want to establish this connection? [Y/n]
> Y

Please enter password for 'myuser'
> 
ERROR [cmk_agent_ctl] Error registering existing host at https://10.111.***.***:8001/mysite

Caused by:
    Request failed with code 403 Forbidden: Unauthorized - Details: Unauthorized to read the global settings

We already removed the host from checkmk and removed the agent with all checkmk artifacts and started with that host from scratch. The issue is still the same.

Hi,

Can you check the role of the user you are passing with "-U myuser"?
In Checkmk 2.2 there should be a dedicated role for that: agent_registration (Agent registration user).
If your user is an automation account, you need to pass "-P <Automation Secret>".
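
A minimal sketch of the two invocation variants mentioned above. The server, site, host and user names are placeholders, and `register_cmd` is a hypothetical helper that only prints the command so it can be reviewed before it is run. Without `-P`, cmk-agent-ctl prompts for the password interactively, as in the first post.

```shell
#!/bin/sh
# Hypothetical helper: prints (does not run) a cmk-agent-ctl register command.
# Arguments: server site hostname user [secret]
register_cmd() {
    server=$1 site=$2 host=$3 user=$4 secret=${5:-}
    cmd="cmk-agent-ctl register -s $server -i $site -H $host -U $user"
    # -P passes the password or automation secret non-interactively
    if [ -n "$secret" ]; then
        cmd="$cmd -P $secret"
    fi
    printf '%s\n' "$cmd"
}

# Normal user: the password is asked for at the prompt
register_cmd cmk.example.com mysite myhostname myuser
# Automation user: the secret is passed with -P (placeholder value)
register_cmd cmk.example.com mysite myhostname myautomationuser MY_AUTOMATION_SECRET
```

Keep in mind that a secret passed on the command line can end up in the shell history.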

Roles
Administrator
Guest user
Normal monitoring user

The Administrator role has at least the following privileges set:

Global settings
Register Host & download monitoring agents of your hosts
Register all hosts & download all monitoring agents

After upgrading to Checkmk 2.2 we were still able to register hosts with admin users.

Additional info: I’ve just created a user with the agent_registration role. Registering with this user produces the same error.

Can you take a look at ~/var/log/web.log and see if any additional errors are logged during registration?

10.111.125.21 - - [03/Aug/2023:16:16:03 +0200] "GET /mobilfunk/check_mk/api/1.0/domain-types/internal/actions/discover-receiver/invoke HTTP/1.1" 200 4 "-" "-"
- - - [03/Aug/2023:16:16:09 +0200] "GET /mobilfunk/check_mk/api/1.0/agent_controller_certificates_settings HTTP/1.1" 403 94 "-" "python-requests/2.31.0"

That’s from ~/var/log/apache/access_log. In web.log nothing related to the registration is logged.
To relate the log excerpt to the first post:

  • 10.111.125.21 is the host that shall be registered
  • mobilfunk is the real value for mysite.
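
A small sketch for pulling only the rejected requests out of the Apache access log, assuming the standard combined log format shown in the excerpt above (request path in field 7, status code in field 9). `forbidden_requests` is a made-up helper name; run it as the site user.

```shell
#!/bin/sh
# Print request path and status of every 403 response in an Apache access log
forbidden_requests() {
    awk '$9 == 403 { print $7, $9 }' "$1"
}

log="$HOME/var/log/apache/access_log"
if [ -r "$log" ]; then
    forbidden_requests "$log"
fi
```

For the excerpt above this prints only the `agent_controller_certificates_settings` request, which is the call that is being rejected.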

Are you sure that you are connecting to the right site?

omd config mobilfunk show | grep AGENT_RECEIVER_PORT

# omd config mobilfunk show | grep AGENT_RECEIVER     
AGENT_RECEIVER: on
AGENT_RECEIVER_PORT: 8001

Yes, it’s the right site.

You can try raising the log level for web, agent registration and authentication in the global settings to see if we find something there.
But be careful and don’t forget to reset the values after testing, as this will create a lot of log data.

I’ve raised the mentioned logging settings to debug.

web.log

2023-08-04 10:09:05,579 [10] [cmk.store 2829] Releasing all locks
2023-08-04 10:09:05,579 [10] [cmk.store 2829] Acquired locks: {}
2023-08-04 10:09:05,579 [10] [cmk.web 2829] Disconnecing site connections
2023-08-04 10:09:06,969 [10] [cmk.store 3040] Trying to acquire lock on /omd/sites/mobilfunk/var/check_mk/web/automation/num_failed_logins.mk
2023-08-04 10:09:06,969 [10] [cmk.store 3040] Got lock on /omd/sites/mobilfunk/var/check_mk/web/automation/num_failed_logins.mk
2023-08-04 10:09:06,970 [10] [cmk.store 3040] Releasing lock on /omd/sites/mobilfunk/var/check_mk/web/automation/num_failed_logins.mk
2023-08-04 10:09:06,970 [10] [cmk.store 3040] Released lock on /omd/sites/mobilfunk/var/check_mk/web/automation/num_failed_logins.mk
2023-08-04 10:09:06,984 [10] [cmk.store 3040] Releasing all locks
2023-08-04 10:09:06,984 [10] [cmk.store 3040] Acquired locks: {}
2023-08-04 10:09:06,984 [10] [cmk.web 3040] Disconnecing site connections

agent-receiver/agent-receiver.log

2023-08-04 10:09:06,986 [40] [agent-receiver 2671] uuid=3f0dddb8-21b8-4d88-ad60-feefdc32a0ff Querying agent controller certificate settings failed. Error message: Unauthorized - Details: Unauthorized to read the global settings

agent-receiver/access.log

::ffff:10.111.125.21:56618 - "POST /register_existing HTTP/1.1" 403

agent-receiver/error.log

[2023-08-04 10:05:58 +0200] [28586] [INFO] Shutting down: Master
[2023-08-04 10:05:58 +0200] [2666] [INFO] Starting gunicorn 20.1.0
[2023-08-04 10:05:58 +0200] [2666] [INFO] Listening at: http://[::]:8001 (2666)
[2023-08-04 10:05:58 +0200] [2666] [INFO] Using worker: agent_receiver.worker.ClientCertWorker
[2023-08-04 10:05:58 +0200] [2671] [INFO] Booting worker with pid: 2671
[2023-08-04 10:05:58 +0200] [2671] [INFO] Started server process [2671]
[2023-08-04 10:05:58 +0200] [2671] [INFO] Waiting for application startup.
[2023-08-04 10:05:58 +0200] [2671] [INFO] Application startup complete.

Unfortunately I can’t figure out the root cause from the logs.
If you have full administrator rights, did not mess up the roles and still get the 403 and unauthorized messages, maybe you should open a support case.
Or hopefully someone else has a good idea :)

Hello, I had the same problem here, and after some tests I think it is an agent problem (in my case version 2.2.0p7).

I did a clean installation of Checkmk server 2.1.0p29 with the same agent version and had no registration problem, but with a clean installation of the 2.2.0p7 server and the updated agent version, the server reports a 403 error on agent registration.

I tried registering with the 2.2.0p7 server and the 2.1.0p29 agent, and it works fine.

Update:
I installed the agent with the clean installation option selected and the two other checkboxes disabled. After this the problem was solved.

I have a similar problem with a CME site (i.e. Managed Services Edition). I’m running 2.2.0p9.cme and I’m no longer able to register new hosts with an automation user “cmkautomation” that I created a while ago (with role “agent_registration”).

On the first try, “cmk-agent-ctl register …” failed with this error: "Request failed with code 500 Internal Server Error: Internal Server Error"

  1. In ~/var/log/agent-receiver/error.log, there was an error:

  File "/omd/sites/main/lib/python3.11/site-packages/agent_receiver/utils.py", line 76, in internal_credentials
    secret = (users_dir() / INTERNAL_REST_API_USER / "automation.secret").read_text().strip()
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/omd/sites/main/lib/python3.11/pathlib.py", line 1058, in read_text
    with self.open(mode='r', encoding=encoding, errors=errors) as f:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/omd/sites/main/lib/python3.11/pathlib.py", line 1044, in open
    return io.open(self, mode, buffering, encoding, errors, newline)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: '/omd/sites/site47/var/check_mk/web/automation/automation.secret'

Looking at the code, INTERNAL_REST_API_USER is hardcoded to “automation”.
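
To check for this condition up front, a quick sketch that tests for the secret file named in the traceback. It assumes you run it as the site user, so that `$HOME` is the site directory; `has_automation_secret` is a made-up helper name.

```shell
#!/bin/sh
# Return 0 if the internal "automation" user has a non-empty secret file
# below the given site root, non-zero otherwise
has_automation_secret() {
    [ -s "$1/var/check_mk/web/automation/automation.secret" ]
}

if has_automation_secret "$HOME"; then
    echo "automation.secret found"
else
    echo "automation.secret missing"
fi
```

If the file is missing, the agent receiver cannot read the internal credentials, which matches the FileNotFoundError above.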

→ Does this mean “cmk-agent-ctl register” will always use the user “automation”, regardless of the user set on the command line?

  2. There was no user “automation” on my system, so I got this error. Obviously the user “automation” is supposed to exist, but I guess it got lost during the updates? (The site has existed since 1.6.)

  3. When I created a user “automation” as a global user with role “agent_registration” and used this user on the command line, “cmk-agent-ctl register” failed with another error: "Request failed with code 403 Forbidden: Unauthorized - Details: Unauthorized to read the global settings"

  4. The user role “agent_registration” doesn’t have the permission “Global settings”.
    When I enabled this permission for “agent_registration”, the registration finally worked.

So there were three problems for our site:

  • Registration is hardcoded to user “automation”?? → Really? Am I missing something here?
  • User “automation” may be missing from your system? → Update problem?
  • Role “agent_registration” lacks permission “Global settings” → Bug?

I’m not sure whether some of these problems might be related to the Managed Services Edition.

No, but internal jobs need an automation account named “automation”.
If this is an older, upgraded Checkmk site, it is possible that there is no “automation” automation user.

The system “automation” user needs to be an admin user account.
For the registration itself you can use a restricted user.
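
One way to verify this without going through the agent might be to repeat the request that showed up with a 403 in the access log, authenticated as the “automation” user. This is only a sketch under assumptions: it is run as the site user, the site's REST API is reachable at http://localhost/<site>/ (adjust scheme, host and port), and `OMD_SITE` is set in the site user's environment. `bearer_header` and `settings_status` are made-up helper names.

```shell
#!/bin/sh
# Build the Checkmk REST API auth header: "Bearer <user> <secret>"
bearer_header() {
    printf 'Authorization: Bearer %s %s' "$1" "$2"
}

# Print the HTTP status of the settings request seen in the access log,
# sent as the "automation" user. Arguments: site secret
settings_status() {
    site=$1 secret=$2
    curl -s -o /dev/null -w '%{http_code}\n' \
        -H "$(bearer_header automation "$secret")" \
        -H 'Accept: application/json' \
        "http://localhost/$site/check_mk/api/1.0/agent_controller_certificates_settings"
}

secret_file="$HOME/var/check_mk/web/automation/automation.secret"
if [ -r "$secret_file" ]; then
    # A 403 here would point at the role of the "automation" user
    settings_status "${OMD_SITE:-mysite}" "$(cat "$secret_file")"
fi
```

If this still returns 403 after the “automation” user has been given the admin role, the cause is probably somewhere else.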

Hmm, makes sense…

Tried it and indeed now it works with a plain “agent_registration” user when “automation” exists and has role “admin”.

Thanks!

this is it! crazy…
Thx one more time @andreas-doehler

Just found out that the automation user must not only exist; it also needs to have “Customer: global” set if you are running the CME version and want to register to a satellite server.

Right! This is how it looks on my site.

For me, in order to register hosts, the automation user needed the agent_registration role plus the following additional permissions: Agent pairing, Read and Write access to all hosts and folders, and Global settings.