Hi. I’ve inherited the management of some Checkmk infrastructure that I’m trying to upgrade to a supported version. It is a distributed setup with one central and nine remote instances, all running the Docker image. They were originally on 2.0.0p16 and were upgraded to 2.0.0p39 with no issues. Now, when I try to upgrade to 2.1.0p49, the upgrade fails. The docker logs show the upgrade starting:
2025-10-14T12:42:18.929921987Z 2025-10-14 12:41:59 - Updating site 'ny_1' from version 2.0.0p39.cee to 2.1.0p49.cee...
It then goes through all the steps, finishing with:
2025-10-14T12:43:05.161518422Z -| 30/32 Fix registered hosts symlinks...
2025-10-14T12:43:05.162137827Z -| 31/32 Update license usage history...
2025-10-14T12:43:05.261438584Z -| 32/32 Synchronize automationuser secrets...
2025-10-14T12:43:07.643787764Z -| Done
2025-10-14T12:43:13.545067046Z Generating configuration for core (type cmc)...
2025-10-14T12:43:13.621893702Z Starting full compilation for all hosts Creating global helper config...OK
After that, “omd status” shows that nothing is running. If I then try to restart the site, it fails with:
Starting cmc...Failed (Config /omd/sites/ny_1/var/check_mk/core/config.pb missing, run "cmk -U" and try again)
The file “~/var/check_mk/core/config.pb” does not exist; I only have a zero-length config.pb.new.
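Since there are ten sites to look at, the file check described above could be scripted roughly as follows. This is only a sketch: the function name check_core_config is made up for illustration, and the var/check_mk/core layout under the site directory is taken from the error message, not from any documented interface.

```shell
#!/bin/sh
# Hypothetical helper (name invented for this post): report the state of the
# core config files below one site directory. The path layout is assumed
# from the "config.pb missing" error message quoted above.
check_core_config() {
    core_dir="$1/var/check_mk/core"

    # -s is true only if the file exists and is larger than zero bytes.
    if [ -s "$core_dir/config.pb" ]; then
        echo "config.pb present"
    elif [ -e "$core_dir/config.pb" ]; then
        echo "config.pb empty"
    else
        echo "config.pb missing"
    fi

    # Flag a leftover, empty config.pb.new like the one seen on ny_1.
    if [ -e "$core_dir/config.pb.new" ] && [ ! -s "$core_dir/config.pb.new" ]; then
        echo "config.pb.new is zero length"
    fi
}

# Usage, as the site user (whose home directory is the site directory):
#   check_core_config "$HOME"
```

On the failing site this should report the same state as described above, which at least makes it quick to compare against the other instances.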
If I run “cmk -U --debug”, I get “Configuration Error: Other restart currently in progress. Aborting”.
I saw something similar mentioned in Werk #16407 (omd update: Don't Delete "config.pb" During Pre-Update), but I’m a bit stuck on how to move forward. Does anyone have any ideas on how I can resolve this or debug it further?
I did set up a test install, and the same upgrade worked fine there, so I’m assuming something in the configuration of the live systems is causing the failure.
Many thanks.