Problem: Checkmk 1.6.0p18 gets stuck when activating changes. I cannot discard the changes, and cmk -O and cmk -R don’t work (“Other restart currently in progress. Aborting.”). If I stop omd, I cannot start it again and have to restore from backup (this has happened before).
Since this has happened more than once, I will describe the scenario in which it occurs, in the hope that someone can help me understand why.
Scenario: I have a lot of active checks configured, and sometimes we have to add hundreds more, so I edit the rules.mk file directly and “force a reload” using the following method:
- I edit the rules.mk file, making sure the file formatting is kept.
- I go to “Host Services and Parameters”, then “Active checks”, then “Check http service” to see if any rules.mk parsing errors show up. If there are none, I can see the new http checks that I added to rules.mk, but they are not yet “loaded”. If I do omd stop or reload at this point, the new http checks don’t show up, which is why I do the next step:
- To “force” the reload of the http checks, I go to “hosts”, click on any host, then just click “save and finish”.
- I go to the changes menu, and apply the changes.
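As a complement to the visual check in the GUI, here is a minimal sketch of a syntax check that could be run on the edited file before activating. It assumes rules.mk is plain Python syntax and that it is run with a Python version that accepts the same syntax as the site’s interpreter. The function name and the sample strings are hypothetical and not part of Checkmk.

```python
def rules_syntax_error(source, filename="rules.mk"):
    """Return None if source parses as Python, else a short error message.

    Only the syntax is checked; the code is compiled but never executed,
    so missing keys or wrong values inside a rule are not detected here.
    """
    try:
        compile(source, filename, "exec")
    except SyntaxError as exc:
        return "%s:%s: %s" % (filename, exc.lineno, exc.msg)
    return None


# Made-up sample content, only to show both outcomes.
good = 'example = {"a": [1, 2]}\n'
bad = 'example = {"a": [1, 2\n'  # closing brackets are missing

print(rules_syntax_error(good))
print(rules_syntax_error(bad))
```

For a real file, the text would be read from disk first and passed in as `source`.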
Most of the time the changes apply successfully and checkmk is reloaded with the new http checks configured, but a few times the following has happened:
- Changes don’t finish applying
- Checkmk keeps running and stays responsive, but without the new checks.
- No subsequent changes can be applied because they go into a “queue” behind the first activation, which never finishes
- Cannot discard changes
- If I stop and start omd, generating the core configuration crashes (log below).
I don’t know where to look for whatever is causing the crash. It’s not a rules.mk problem, because after restoring from backup I make the same edit and it works fine.
OMD[supcdteste]:~/var/check_mk/wato/log$ omd start
Starting mkeventd…OK
Starting liveproxyd…OK
Starting mknotifyd…OK
Starting rrdcached…OK
Starting cmc…Failed (Config /omd/sites/supcdteste/var/check_mk/core/config missing, run “cmk -U” and try again)
Starting apache…OK
Starting dcd…OK
Initializing Crontab…OK
OMD[supcdteste]:~/var/check_mk/wato/log$ cmk -U
Generating configuration for core (type cmc)…Process Process-2:
Traceback (most recent call last):
File “/omd/sites/supcdteste/lib/python2.7/multiprocessing/process.py”, line 267, in _bootstrap
self.run()
File “/omd/sites/supcdteste/lib/python2.7/multiprocessing/process.py”, line 114, in run
self._target(*self._args, **self._kwargs)
File “/omd/sites/supcdteste/lib/python/cmk_base/cee/core_cmc.py”, line 635, in wrapper
return func(*args, **kwargs)
File “/omd/sites/supcdteste/lib/python/cmk_base/cee/core_cmc.py”, line 647, in get_host_configurations
result = [host_class(hostname).get_serialized_data() for hostname in hostlist]
File “/omd/sites/supcdteste/lib/python/cmk_base/cee/core_cmc.py”, line 931, in __init__
host_macros={})
File “/omd/sites/supcdteste/lib/python/cmk_base/cee/core_cmc.py”, line 711, in __init__
self._compute()
File “/omd/sites/supcdteste/lib/python/cmk_base/cee/core_cmc.py”, line 949, in _compute
self._cmc_services()
File “/omd/sites/supcdteste/lib/python/cmk_base/cee/core_cmc.py”, line 1101, in _cmc_services
active_check_name, params)
File “/omd/sites/supcdteste/lib/python/cmk_base/config.py”, line 905, in active_check_service_description
description = act_info["service_description"](params)
File “/omd/sites/supcd/share/check_mk/checks/check_http”, line 266, in check_http_description
description = params["name"]
Exception: ‘name’
Original Traceback (most recent call last):
File “/omd/sites/supcdteste/lib/python/cmk_base/cee/core_cmc.py”, line 635, in wrapper
return func(*args, **kwargs)
File “/omd/sites/supcdteste/lib/python/cmk_base/cee/core_cmc.py”, line 647, in get_host_configurations
result = [host_class(hostname).get_serialized_data() for hostname in hostlist]
File “/omd/sites/supcdteste/lib/python/cmk_base/cee/core_cmc.py”, line 931, in __init__
host_macros={})
File “/omd/sites/supcdteste/lib/python/cmk_base/cee/core_cmc.py”, line 711, in __init__
self._compute()
File “/omd/sites/supcdteste/lib/python/cmk_base/cee/core_cmc.py”, line 949, in _compute
self._cmc_services()
File “/omd/sites/supcdteste/lib/python/cmk_base/cee/core_cmc.py”, line 1101, in _cmc_services
active_check_name, params)
File “/omd/sites/supcdteste/lib/python/cmk_base/config.py”, line 905, in active_check_service_description
description = act_info["service_description"](params)
File “/omd/sites/supcd/share/check_mk/checks/check_http”, line 266, in check_http_description
description = params["name"]
KeyError: ‘name’
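The traceback ends in a KeyError for ‘name’ inside check_http_description, so one thing that can be checked before activating is whether every hand-added http parameter set carries a "name" key. Below is a minimal sketch of such a check. It assumes the parameter sets have already been extracted into a list of Python dicts; how they are laid out inside rules.mk is not shown here, and the function name and sample data are hypothetical, not part of Checkmk.

```python
def entries_missing_key(params_list, key="name"):
    """Return the indexes of dict entries in params_list that lack key.

    Entries that are not dicts are skipped, since this sketch only
    knows how to inspect dict-style parameter sets.
    """
    return [
        index
        for index, params in enumerate(params_list)
        if isinstance(params, dict) and key not in params
    ]


# Made-up parameter sets; the second one has no "name".
sample = [
    {"name": "Website A", "host": "a.example.com"},
    {"host": "b.example.com"},
]

print(entries_missing_key(sample))  # [1]
```

An empty result would mean that none of the inspected dicts is missing the key; it says nothing about why the activation itself hangs.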