Ruleset "Agent pairing": what does it do?

r.sander · August 31, 2022, 3:32pm

CMK version: 2.1.0p11
OS version: Debian 11

What does the ruleset “Agent pairing” do?
From its value I assume it should be possible to do the TLS registration automatically with it.

When baking an agent package including this rule I get a config file /etc/check_mk/agent_pairing.cfg on the host which includes the automation user credentials and the root CA certificate of the site.
But the agent does not get registered.

The troubling thing is that after the update from 2.0 to 2.1 there is no longer any process listening on port 6556. Why does this happen?

andreas-doehler · August 31, 2022, 7:12pm

This problem appears to depend on the OS version and the package management in use.
I ran some tests for @AndiU on different OS versions and got very different upgrade experiences.

The best results were on current Debian and Ubuntu releases with 2.0 agents deployed by the agent bakery. They were working after the update without problems, but without TLS.
The worst was Red Hat/CentOS, and SUSE was somewhere in between.

r.sander · September 1, 2022, 7:06am

My conclusion so far: I will not recommend upgrading to 2.1 in reasonably sized infrastructures that use the agent updater. The amount of manual work to get the agent running again is just too much, @LaMi.

mike1098 · September 1, 2022, 8:02am

Following all the posts here and the recent werks, I can fully agree with that. It looks like the upgrade to 2.1 will become a nightmare in our big distributed environment.

That is a no-go in an enterprise environment. The primary Linux OS in our data center is Red Hat.

My feeling is that new development no longer focuses on customer requirements and that the priority is to push new features to make the product as attractive as possible.

For now we cannot migrate to 2.1 anyway because of missing features and, to be honest, we are still fighting issues in 2.0 because features were removed (e.g. clustered checks).

regards

Michael

CFriedrich · September 1, 2022, 8:56am

Hi, we see it that way at our company as well. If and when we update to 2.1 is still completely unclear. The situation at our company is the same as at Mike's: we have Red Hat only.

mike1098 · September 1, 2022, 9:15am

I vote for Tribe29 providing long-term support releases, as are available for Ubuntu, to give enterprise customers more time to upgrade.

LaMi · September 1, 2022, 9:42am

For now, I can at least clarify the Linux distro question:

Enterprise distributions are by no means low priority. We will not discontinue or weaken support for these distributions. We know they are important to you, so they are equally important to us.

Besides that, we all know that agent updates are important and need to be stable so that you can manage the many agents. Unfortunately, there seem to be some (distro specific?) technical issues that make the update to 2.1 more problematic than before. Be sure: It is important to us and we are taking care of it. Whenever you notice specific problems, let us know. We want to find these bugs and get them under control.

r.sander · September 1, 2022, 9:47am

I upgraded a test installation from 2.0 to 2.1 and after the agent updater did its thing the agent did not listen on port 6556 any more. Only after I manually ran “cmk-agent-ctl register” it listened again.

This was not expected as I created rules in “Agent controller” {'agent_ctl_enabled': True} and “Agent pairing” {'pairing_on_installation': True, 'server': 'checkmk-test.mauerpark.heinlein-intern.de', 'user': 'automation'} before baking the packages for the 2.1 version.

Host OS is Ubuntu 20.04 in this case.
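
To confirm the symptom described above from the monitoring server or from the host itself, a plain TCP connect test against port 6556 is enough. The following is a minimal sketch, not part of Checkmk; the function name `port_is_open` and the `localhost` target are placeholders chosen for illustration.

```python
import socket


def port_is_open(host: str, port: int = 6556, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False


if __name__ == "__main__":
    # "localhost" is a placeholder; use the monitored host's name or address.
    state = "open" if port_is_open("localhost") else "closed"
    print(f"agent port 6556 is {state}")
```

A closed port only shows that nothing accepts connections there; it does not tell whether the agent controller is missing, unregistered, or blocked by a firewall.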

AndiU · September 1, 2022, 10:14am

Thanks for the additional information.

First of all: Automatic registration doesn’t work yet, sorry. The agent pairing ruleset got added accidentally to Checkmk 2.1 and has no effect. It will be removed with p12: Remove preliminary agent pairing ruleset

Can you rule out that a 2.1 agent (and with it the agent controller) has been installed to that test system before?
On initial update from 2.0 to 2.1, the agent controller should work in legacy mode and allow non-TLS connections. But as soon as there is or has been an agent controller on the system, the legacy case won’t be detected anymore and the agent controller won’t open a port without registration.

r.sander · September 1, 2022, 11:08am

There once was a 2.1 agent on that test system, but I apt purged it and removed the remaining /etc/cmk-update-agent.state file. After that I installed the 2.0 agent and registered the agent updater again to the 2.0 test site.

AndiU · September 1, 2022, 11:30am

In this case, it’s the cmk-agent user that remains on the system.
On installation, the script that adds the user detects that it’s already there, and skips activating the legacy mode.

andreas-doehler · September 1, 2022, 11:36am

It is not only a distribution-specific problem. From my point of view the biggest problem is that there is no safe fallback for the agent installation.

In my tests I had many systems where 2.0 was working without problems, and after a simple upgrade (same bakery config as 2.0) the agent was broken. I think this is also what @mike1098 means when he says an upgrade is not possible in his big environment.
In the only bigger environment that I updated to 2.1, I needed to disable the agent updater for all Red Hat/CentOS machines to keep them working.

r.sander · September 1, 2022, 11:46am

Shouldn’t a postrm script remove this account in case of a purge?

AndiU · September 5, 2022, 1:50pm

Just like it’s the case for the /etc/cmk-update-agent.cfg file, the registration information of the agent controller stays behind on uninstallation. To be more precise: The /var/lib/cmk-agent folder won’t be deleted. As a consequence, the cmk-agent user also has to stay.
As the agent controller sees that there was an installation before, it won’t fall back to unencrypted communication.
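
The behaviour described above can be checked on a host before upgrading. The sketch below is not the actual installation script; it only mirrors the description in this thread. The helper names are hypothetical, and treating either the `cmk-agent` user or the `/var/lib/cmk-agent` directory as a blocker is an assumption (the thread names the existing user as the trigger).

```python
import os
import pwd


def user_exists(name: str) -> bool:
    """Return True if a local account with this name exists."""
    try:
        pwd.getpwnam(name)
        return True
    except KeyError:
        return False


def legacy_mode_expected(agent_user_present: bool, state_dir_present: bool) -> bool:
    """Assumption: legacy (non-TLS) mode is only expected on a host that
    shows no trace of an earlier agent controller installation."""
    return not (agent_user_present or state_dir_present)


if __name__ == "__main__":
    user = user_exists("cmk-agent")
    state = os.path.isdir("/var/lib/cmk-agent")
    if legacy_mode_expected(user, state):
        print("no leftovers found: legacy mode should be activated on first install")
    else:
        print("leftovers found: expect no open port until the agent is registered")
```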

Yes, we could as well delete all data on agent uninstallation. It’s designed this way to make it easier to re-enable the monitoring for a host at a later time. The registration has to be done manually, and credentials have to be passed. It can be awkward to do this several times. By leaving the registration information on the hosts, they can easily be reactivated later on just by installing the agent package.

On the other hand, it could be argued that this is an unusual approach that one would not expect.

Complete removal on dpkg purge would be an idea, but would lead to inconsistencies between DEB and RPM based Linuxes.

However, we would be interested in your opinion. We chose this approach a long time ago when we introduced the agent updater.
What do you think about this?
Should we continue to leave some data behind for convenience, or should we go for a cleaner approach and just delete all data on agent package uninstallation completely?
It’s been like it is for a while now, and maybe you’ve had some experiences that speak for one approach or the other.

r.sander · September 5, 2022, 2:04pm

An apt purge has to remove all directories and config files.

It must be possible to start with a clean state after removing the package.

gstolz · September 5, 2022, 8:45pm

I agree with Robert. A “purge” is not called regularly or by default, but only when the admin specifically demands that all configs be removed, and the agent should follow this demand. If I want the comfort of re-enabling the old config, I shouldn’t run “purge” but can still just uninstall/remove.

AndiU · September 6, 2022, 5:23am

Alright, got that so far.

What about the other way round? Do you think it’s useful to keep the registration files at all? Or should they vanish on apt remove as well?
We are not talking about config files. Yes, /etc/cmk-update-agent.state is at /etc, but it’s not marked as a config file and is placed there at runtime. The files under /var/lib/cmk-agent are obviously not config files.

And what about RPM? We don’t have the remove/purge differentiation here. There is an automatic backup copy of modified config files on removal, but that doesn’t apply to our registration files.
Should we just remove the mentioned files on uninstallation on RPM systems?

mike1098 · September 6, 2022, 4:00pm

Basically the rule for RPM is that only the files which are installed by the package must be removed on uninstallation.
Nevertheless, a few words in the docs on how to do a “Tabula Rasa” would be helpful.
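
Until such documentation exists, a first step could be to list what is left behind after uninstalling. The sketch below only reports; it deletes nothing. The path list contains just the locations named in this thread and may be incomplete, and `find_leftovers` is a hypothetical helper, not a Checkmk tool.

```python
import os

# Locations mentioned in this thread; the list may be incomplete.
LEFTOVER_PATHS = (
    "/etc/cmk-update-agent.cfg",
    "/etc/cmk-update-agent.state",
    "/var/lib/cmk-agent",
)


def find_leftovers(paths=LEFTOVER_PATHS, root="/"):
    """Return the entries of `paths` that still exist below `root`."""
    found = []
    for path in paths:
        candidate = os.path.join(root, path.lstrip("/"))
        if os.path.lexists(candidate):
            found.append(path)
    return found


if __name__ == "__main__":
    for path in find_leftovers():
        print(f"still present: {path}")
```

The `root` parameter exists only so the check can be run against a mounted image or a test directory; on a live host the default is sufficient.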

regards

Michael
