Since updating to 2.3.0, the activation process often hangs on one or more sites. This is especially likely when all sites have updates to process, e.g. after updating an extension package. For context: we have more than 50 sites in total.
What happens is:
Activation starts
Progress for some gets stuck at “activating”, for others at “synchronizing”
Some sites get to “failed” with error messages such as [Error 9] bad file descriptor or [SSL: WRONG_VERSION_NUMBER]
After waiting for maybe 2 minutes an error message is shown at the top: Unknown activation process
Here’s a screenshot of all of those occurring simultaneously:
In such a case I have to reload the activation page & try activation again. It often takes three, sometimes even four tries until all sites are successfully activated.
It is not deterministic which sites get stuck in which state, which ones end up in a “failed” state, or which of the two aforementioned error messages they fail with.
The “failed” state with one of those error messages also happens seemingly randomly during activation jobs when only a handful of sites have to be updated. In that case I haven’t seen sites being stuck in “activating” or “synchronizing” yet.
Hi,
We had the same problem: changes could not be activated. In our case the cause was incorrect file permissions.
The file netstat.save is located in the web folder of the site. Check_mk no longer had access to this file because it suddenly belonged to the root user. A simple chown on netstat.save solved the problem.
In our case, the file was located under /omd/sites//var/check_mk/web
Thanks for the feedback. That’s interesting. Unfortunately it doesn’t apply to our situation. I’ve checked with the following shell snippet:
cd /opt/omd/sites
# For each site directory, list everything not owned by that site's user
for site in * ; do
  find "$site" ! -user "$site"
done
That is basically “for each site, find every item not owned by that site’s user”. It turned up no hits on any of our sites. Dang, that would have been an easy fix.
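For anyone who wants to run the same check and also see who actually owns the offending items, here is a minimal sketch. The function name list_foreign_files is made up for illustration, and it assumes a find that supports "-exec ... {} +" (GNU and other POSIX-compliant versions do). It only reports, it does not change anything.

```shell
#!/bin/sh
# list_foreign_files DIR USER  (hypothetical helper, not part of Checkmk or OMD)
# Print a long listing (including owner and group) of every item below DIR
# that is NOT owned by USER. Prints nothing if everything is owned by USER.
list_foreign_files() {
    dir=$1
    user=$2
    # "! -user" negates the ownership test; with "{} +" ls is only invoked
    # when there is at least one match.
    find "$dir" ! -user "$user" -exec ls -ld {} +
}

# Example usage, assuming sites live in /opt/omd/sites and each site
# directory is named after its site user:
#
#   for site_dir in /opt/omd/sites/*/; do
#       site=$(basename "$site_dir")
#       list_foreign_files "$site_dir" "$site"
#   done
```

If this does print something, the owner column shows whether root (or another user) took over the file, which is the situation described in the netstat.save case above.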
This means someone modified the file as root, which changed the permissions.
If you interact with a Checkmk site, always become the site user (omd su $SITE).