Are there any instructions or a procedure on how to update a distributed setup?
We are currently using the CheckMK appliance for our master nodes in an HA setup.
Then we have 8 distributed nodes running CentOS connected to the master nodes.
We want to update from 1.5.0 to 1.6.0, but I can’t find any information on how to do this in a distributed setup.
Do we update the distributed nodes first or the master nodes?
Why not the other way around? I thought it would make more sense because of the backwards compatibility.
I always patch the master first, then the satellites.
Are there problems to be expected if I do it like that?
For the update from 1.5 to 1.6 it does not matter which you update first. But keep in mind that it is not possible to activate changes correctly in a mixed-version setup. As @r.sander already said: “Do not activate any changes in between”
In general, if an update is incompatible, like 1.5 to 1.6, you need to update all systems in the same maintenance window. Whether you start with the master or the slaves is not relevant.
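To make the "no activation on a mixed setup" rule concrete, here is a minimal sketch of a check you could run against your own list of sites before activating changes. The site names, the version strings, and both helper functions are hypothetical; the sketch only assumes that the first two version components (for example 1.5 vs. 1.6) identify the release line.

```python
from typing import Dict


def release_line(version: str) -> str:
    """Return the release line of a version string, e.g. '1.6' for '1.6.0p5'."""
    major, minor = version.split(".")[:2]
    return major + "." + minor


def is_mixed(site_versions: Dict[str, str]) -> bool:
    """True if the sites are not all on the same release line."""
    lines = {release_line(v) for v in site_versions.values()}
    return len(lines) > 1


# Hypothetical inventory: master already updated, one remote site still on 1.5.
sites = {"master": "1.6.0", "remote1": "1.6.0", "remote2": "1.5.0p20"}
if is_mixed(sites):
    print("Mixed versions: do not activate changes yet")
```

Patch releases within the same release line (say 1.5.0p19 and 1.5.0p20) are treated as equal here, which is an assumption of the sketch and not a statement about Checkmk's compatibility guarantees.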
This worked fine for me, with just a few small extra steps.
I made a full backup on the master node.
I disabled all notifications and then started with a single distributed site.
I also did a site copy of it, just in case.
As no config conflicts were detected on the first distributed node, I just updated the rest of the distributed nodes.
And lastly the master node.
I let everything settle down. I had to re-inventory a few hosts that were using piggybacked k8s data (I'm guessing due to Werk #11263).
Then I re-enabled all notifications again.
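The order described above can be sketched roughly like this. The site names and both helper functions are made up for illustration. The `omd stop`, `omd cp`, `omd update` and `omd start` subcommands exist, but the sketch assumes the new Checkmk package is already installed on the node and that the commands are run as root on the CentOS nodes. Masters on the appliance are normally updated through the appliance web interface instead, so no commands are generated for them.

```python
from typing import List


def update_order(master: str, remotes: List[str], pilot: str) -> List[str]:
    """Pilot remote site first, then the other remotes, master last."""
    if pilot not in remotes:
        raise ValueError("pilot must be one of the remote sites")
    rest = [r for r in remotes if r != pilot]
    return [pilot] + rest + [master]


def remote_site_commands(site: str) -> List[str]:
    """Commands for one remote site (assumption: run as root, package installed)."""
    return [
        "omd stop " + site,                       # the site must be stopped for 'omd cp'
        "omd cp " + site + " " + site + "_copy",  # safety copy, hypothetical name
        "omd update " + site,                     # interactive, reports config conflicts
        "omd start " + site,
    ]


# Hypothetical site names.
for name in update_order("master", ["remote1", "remote2", "remote3"], "remote2"):
    print(name)
```

Printing the plan first and running the commands by hand on the pilot site keeps the point where config conflicts would show up under manual control.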
Would be nice if https://checkmk.com/cms_update.html had a section with a note on recommendations/steps for distributed setups.
And perhaps a matrix showing which versions are backwards compatible and which are not.
Maybe there is such a page somewhere and I’m just missing it?