Timeout problem 1.6 CEE with CMC when activating changes

Hello everyone,

I have a timeout problem on our 1.6 CEE with CMC when activating changes.

When I activate changes via WATO, it runs into a timeout 99% of the time.

“Your request timed out after 110 seconds. This issue may be related to a local configuration problem or a request which works with a too large number of objects. But if you think this issue is a bug, please send a crash report.”

The only remedy that gets the changes applied is:

  1. cmk -R as site user
  2. followed by a cmk -O
  3. finally discard changes via WATO

After that, WATO is clean again.
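
To compare the console run times with the 110 second limit from the error message, the first two steps can be timed with a small wrapper. This is only a sketch: `timed_run` is a made-up helper name, the script assumes it is started as the site user so that `cmk` is on the PATH, and step 3 (discarding the changes in WATO) stays manual.

```python
import shutil
import subprocess
import time


def timed_run(command):
    """Run a command and return (exit_code, elapsed_seconds)."""
    start = time.monotonic()
    result = subprocess.run(command, check=False)
    return result.returncode, time.monotonic() - start


if __name__ == "__main__":
    # Only "cmk -R" and "cmk -O" come from the post; the rest is illustrative.
    for command in (["cmk", "-R"], ["cmk", "-O"]):
        if shutil.which(command[0]) is None:
            print("cmk not found; run this as the site user")
            break
        code, elapsed = timed_run(command)
        print(f"{' '.join(command)}: exit {code}, {elapsed:.1f} s (WATO limit: 110 s)")
```

Note that `cmk -R` restarts the monitoring core, so this is something to run on purpose, not in a loop.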

However, I cannot figure out where this problem comes from.

I have had this problem since the 1.6 beta, up to the current p7.

I have also set up a new VM with a clean install; only the site was transferred via backup and restore.

Perhaps somebody has an idea how to fix it or how to investigate where the problem comes from.

Greetz Bernd

Output from cmk -R -vv (shortened):

Time needed [cmc_all_hosts]: 20.15 sec
Time needed [cmc_groups]: 0.00 sec
Time needed [cmc_groups]: 0.00 sec
Time needed [cmc_groups]: 0.00 sec
Time needed [cmc_stringlist]: 0.00 sec
Time needed [cmc_contactlists]: 0.00 sec
Not importing state from Nagios, /omd/sites/main/var/nagios/retention.dat not found.
/omd/sites/main/var/check_mk/core/config written.
Try aquire lock on /omd/sites/main/var/check_mk/stored_passwords
Got lock on /omd/sites/main/var/check_mk/stored_passwords
Releasing lock on /omd/sites/main/var/check_mk/stored_passwords
Released lock on /omd/sites/main/var/check_mk/stored_passwords
OK
Packing config…Try aquire lock on /omd/sites/main/var/check_mk/base/precompiled_check_config.mk.orig
Got lock on /omd/sites/main/var/check_mk/base/precompiled_check_config.mk.orig
Releasing lock on /omd/sites/main/var/check_mk/base/precompiled_check_config.mk.orig
Released lock on /omd/sites/main/var/check_mk/base/precompiled_check_config.mk.orig
OK
Restarting monitoring core…OK

And the output from cmk -O:

Time needed [cmc_all_hosts]: 5.68 sec
Time needed [cmc_groups]: 0.00 sec
Time needed [cmc_groups]: 0.00 sec
Time needed [cmc_groups]: 0.00 sec
Time needed [cmc_stringlist]: 0.00 sec
Time needed [cmc_contactlists]: 0.00 sec
Not importing state from Nagios, /omd/sites/main/var/nagios/retention.dat not found.
/omd/sites/main/var/check_mk/core/config written.
Try aquire lock on /omd/sites/main/var/check_mk/stored_passwords
Got lock on /omd/sites/main/var/check_mk/stored_passwords
Releasing lock on /omd/sites/main/var/check_mk/stored_passwords
Released lock on /omd/sites/main/var/check_mk/stored_passwords
OK
Packing config…Try aquire lock on /omd/sites/main/var/check_mk/base/precompiled_check_config.mk.orig
Got lock on /omd/sites/main/var/check_mk/base/precompiled_check_config.mk.orig
Releasing lock on /omd/sites/main/var/check_mk/base/precompiled_check_config.mk.orig
Released lock on /omd/sites/main/var/check_mk/base/precompiled_check_config.mk.orig
OK
Reloading monitoring core…OK
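
To see at a glance which phase eats the time, the "Time needed [...]" lines can be pulled out of the verbose output and sorted. A minimal Python sketch, assuming the line format shown in the two outputs above; the function name `slowest_phases` and the sample text are made up for illustration.

```python
import re

# Matches lines such as "Time needed [cmc_all_hosts]: 20.15 sec"
TIMING_RE = re.compile(
    r"^Time needed \[(?P<phase>[^\]]+)\]: (?P<secs>\d+(?:\.\d+)?) sec$"
)


def slowest_phases(log_text, top=3):
    """Return (phase, seconds) pairs found in the output, slowest first."""
    timings = []
    for line in log_text.splitlines():
        match = TIMING_RE.match(line.strip())
        if match:
            timings.append((match.group("phase"), float(match.group("secs"))))
    # sorted() is stable, so phases with equal timings keep their log order
    return sorted(timings, key=lambda item: item[1], reverse=True)[:top]


sample = """\
Time needed [cmc_all_hosts]: 20.15 sec
Time needed [cmc_groups]: 0.00 sec
Time needed [cmc_stringlist]: 0.00 sec
"""
print(slowest_phases(sample, top=1))
```

With the outputs above, cmc_all_hosts is the only phase that takes measurable time in both runs.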

top - 12:24:04 up 3 days, 5:03, 1 user, load average: 46.71, 44.69, 30.80
Tasks: 441 total, 111 running, 330 sleeping, 0 stopped, 0 zombie
%Cpu(s): 90.9 us, 9.1 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
MiB Mem : 24105.7 total, 5430.2 free, 9589.1 used, 9086.3 buff/cache
MiB Swap: 8188.0 total, 7771.7 free, 416.2 used. 10767.1 avail Mem

How many cores does your machine have?
A load between 30 and 47 is really demanding for the system.
After activating the config on the command line, I would only remove the file containing the pending changes.
Discarding changes inside WATO can lead to an “unwanted” or “older” state of your configuration.
You can find the files with the changes inside “~/var/check_mk/wato/”.
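
To put the load figures into perspective, it helps to divide the load average by the number of cores. As a rule of thumb, a value well above 1 per core means processes are queuing for CPU time. A small Python sketch; the helper name `load_per_core` and the 8 cores in the example are assumptions, since the core count is not given in this thread.

```python
import os


def load_per_core(load_average, cores):
    """Return the load average divided by the number of CPU cores."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return load_average / cores


# 1-minute load taken from the top output; 8 cores is a made-up number
print(round(load_per_core(46.71, 8), 2))

# On the monitoring server itself (os.getloadavg exists on Unix only)
if hasattr(os, "getloadavg"):
    one_minute, _, _ = os.getloadavg()
    print(round(load_per_core(one_minute, os.cpu_count() or 1), 2))
```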

I only had problems activating changes from the web after I modified or installed a new check.
Then a one-time “cmk -R” was needed. After that, web activation worked without problems.

I know … because my disk was running full … :crazy_face:

[Errno 28] No space left on device: '/omd/sites/main/tmp/check_mk/…
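
A quick way to catch this before it bites is to check the free space on the filesystem that holds the site. A minimal Python sketch; `free_space_report` and the 512 MiB threshold are made up for illustration, and the path to check (for example the site's tmp directory) depends on your setup.

```python
import shutil


def free_space_report(path, min_free_mib=512):
    """Return (free_mib, ok) for the filesystem that holds `path`."""
    usage = shutil.disk_usage(path)
    free_mib = usage.free / (1024 * 1024)
    return free_mib, free_mib >= min_free_mib


# Example: check the filesystem of the current directory
free_mib, ok = free_space_report(".")
print(f"{free_mib:.0f} MiB free, {'ok' if ok else 'LOW'}")
```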

Now there is more space again, and the unwanted files have been deleted.

Hi @andreas-doehler ,

you were right about the CPU load.

I have created a new site on a second server and transferred around 1k hosts to the new distributed site.

Now it's working again, not as fast as expected, but it works!

5K Hosts on Site1 with ~30K Services
1K Hosts on Site2 with ~10K Services

Hi everybody,

What is the difference between activating changes via WATO and via cmk -R?

In WATO I mostly run into the timeout.

The execution time on the console is around 35 sec …

With cmk -O it is around 65 sec.

Inside WATO you have the web server in between, and this leads to the timeout problem.
The activation runs in the background and does complete; the web server just gets no response in time, and as a result WATO does not reset the pending changes.
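
The effect can be reproduced with a toy model: a background job that takes longer than the caller is willing to wait. This is not Checkmk code, just a Python sketch with made-up names (`activate`, `request_with_timeout`) and scaled-down times.

```python
import threading
import time


def activate(duration, done):
    """Stand-in for the activation job: sleeps, then records that it finished."""
    time.sleep(duration)
    done.set()


def request_with_timeout(job_seconds, timeout_seconds):
    """Start the job in the background and wait at most timeout_seconds for it.

    Returns (answered_in_time, job_finished).
    """
    done = threading.Event()
    worker = threading.Thread(target=activate, args=(job_seconds, done))
    worker.start()
    answered_in_time = done.wait(timeout_seconds)
    worker.join()  # the job keeps running and finishes regardless of the timeout
    return answered_in_time, done.is_set()


# Scaled down: a 0.3 s job behind a 0.1 s timeout
print(request_with_timeout(0.3, 0.1))
```

The first value says whether the caller got its answer in time, the second whether the job itself finished. A slow job gives "timed out, but done", which matches what happens here: the core has the new config, yet WATO still shows pending changes.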

Now I'm a step further.

The problem is nearly gone when using the classic theme.