Checkmk Service (null) Returns On Many Servers

CMK version: 2.0.0p9
OS version: CentOS 7

Error message: A few times each day, mostly on our busier servers, the Checkmk service goes stale and begins to return (null) across the board. Often this causes almost all of the hosts on a given server to go stale.

Someone on our team recently looked at what happens to the server load at the same time. When all of the hosts go stale, it’s not because the load hits a level where the server can’t keep up and stays there. The load actually plummets to next to nothing, stays there for an indeterminate amount of time, and then skyrockets as everything comes out of being stale and catches up. This was the opposite of what we expected; we expected the load to stay sky high the whole time.
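For anyone wanting to confirm the same pattern on their own servers, the load collapse described above can be picked out of recorded load samples. This is only a rough Python sketch and not part of Checkmk: the function name `find_load_dips`, the `(timestamp, load)` sample format, and the `dip_ratio` and `min_len` thresholds are all assumptions chosen for illustration.

```python
from statistics import median


def find_load_dips(samples, dip_ratio=0.2, min_len=2):
    """Return (start_ts, end_ts) windows where load stays far below normal.

    samples   -- list of (timestamp, load) tuples in chronological order
    dip_ratio -- a sample counts as a dip if load < dip_ratio * median load
                 (assumed threshold, tune for your servers)
    min_len   -- minimum number of consecutive dip samples to report
    """
    if not samples:
        return []

    # Use the median as a baseline so short spikes do not skew it.
    baseline = median(load for _, load in samples)
    threshold = baseline * dip_ratio

    dips = []
    run = []  # timestamps of the current consecutive run of dip samples
    for ts, load in samples:
        if load < threshold:
            run.append(ts)
        else:
            if len(run) >= min_len:
                dips.append((run[0], run[-1]))
            run = []

    # Close a run that lasts until the end of the data.
    if len(run) >= min_len:
        dips.append((run[0], run[-1]))
    return dips
```

The samples could come from something as simple as logging `os.getloadavg()[0]` once a minute. The resulting windows can then be compared by hand with the times at which the hosts went stale.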

We’re on 2.0.0p9 right now, planning an upgrade to 2.1.0p24 in the very near future. I’m curious if anyone’s had any new insight on this topic since the last forum post closed up in November of 2022 due to inactivity.

Thanks! Will post additional info as needed or requested here.


This is a known problem with 2.0 when Nagios is used as the core (Raw or Enterprise edition).
The (null) appears when changes are activated.
I haven’t seen this problem with 2.1.

Just executed the upgrade to 2.1.0p24 across the board, and I’m still seeing the (null) errors as well as a lot of stale services. Should I expect this to subside on its own, or are there other settings I should be looking at?

Update: It looks like the (null) errors I see now on 2.1 are static, in the sense that each one is tied to a service check that actually isn’t working, as opposed to the random ups and downs we saw before. I’d consider this issue fixed for myself at this point.
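The distinction above, between a check that always returns (null) and one that only does so now and then, can be made mechanically once a few rounds of service output have been collected. The Python below is a hedged sketch of that idea only: `classify_null_services` and the `history` dictionary layout are hypothetical, and how the outputs are gathered (for example by sampling the service output periodically) is left open.

```python
def classify_null_services(history):
    """Label services that have returned "(null)" as static or intermittent.

    history -- dict mapping a service name to its list of recorded plugin
               outputs, oldest first (hypothetical layout for this sketch)

    Returns a dict of service name -> "static" (every sample was "(null)")
    or "intermittent" (only some samples were). Services that never
    returned "(null)" are left out.
    """
    result = {}
    for service, outputs in history.items():
        nulls = [out.strip() == "(null)" for out in outputs]
        if not any(nulls):
            continue  # never returned (null), or no samples recorded
        result[service] = "static" if all(nulls) else "intermittent"
    return result
```

A service labelled "static" points at a check that is broken in itself, while "intermittent" matches the random ups and downs seen on 2.0.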
