I currently have a strange problem with the state of the notification spooler. My Checkmk service reports that the notification spooler is not running, although it is in fact running and working. I tried to troubleshoot the problem using the logs, but found nothing unusual there.
I can see the process running in htop, and it is doing its job because I receive e-mails.
Here is a screenshot of the service problems, which do not reflect the actual state of the spooler:
The omd status output shows that mknotifyd is running and OK:
OMD[SITENAME]:~$ omd status
mkeventd: running
liveproxyd: running
mknotifyd: running
rrdcached: running
cmc: running
apache: running
crontab: running
-----------------------
Overall state: running
As described on https://checkmk.com/cms_notifications.html
I took a look at …
… but found nothing unusual there, even at the highest log level (debugging).
I also took a look at the agent output to find the problem, as you can see here:
<<<omd_status:cached(1579006293,60)>>>
[SITENAME]
mkeventd 0
liveproxyd 0
mknotifyd 0
rrdcached 0
cmc 0
apache 0
crontab 0
OVERALL 0
<<<mknotifyd:sep(0)>>>
[SITENAME]
Version: 1.5.0p23
Updated: 1579006331 (2020-01-14 13:52:11)
Started: 1579005428 (2020-01-14 13:37:08, 903 sec ago)
Configuration: 1579005428 (2020-01-14 13:37:08, 903 sec ago)
Listening FD: None
Spool: New
Count: 0
Oldest:
Youngest:
Spool: Deferred
Count: 0
Oldest:
Youngest:
Spool: Corrupted
Count: 0
Oldest:
Youngest:
Queue: mail
Waiting: 0
Processing: 0
(SITENAME,36508,11388,00:00:00/15:09,851) python /omd/sites/SITENAME/bin/mknotifyd
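One way to sanity-check the raw mknotifyd section by hand is to parse its "Key: value" lines and see how old the Updated timestamp is compared to a reference time. The following is only a minimal sketch in Python: the function names parse_mknotifyd_section and seconds_since_update are hypothetical, this is not the code of the Checkmk check plugin, and whether the server-side check actually looks at the age of Updated is an assumption on my part.

```python
import re

# Sample taken from the agent output above (site name censored).
SAMPLE = """\
[SITENAME]
Version: 1.5.0p23
Updated: 1579006331 (2020-01-14 13:52:11)
Started: 1579005428 (2020-01-14 13:37:08, 903 sec ago)
Listening FD: None
Spool: New
Count: 0
(SITENAME,36508,11388,00:00:00/15:09,851) python /omd/sites/SITENAME/bin/mknotifyd
"""

SITE_RE = re.compile(r"\[(.+)\]")
FIELD_RE = re.compile(r"([A-Za-z ]+):\s*(.*)")


def parse_mknotifyd_section(text):
    """Collect the first occurrence of each 'Key: value' field per site."""
    sites = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        site = SITE_RE.fullmatch(line)
        if site:
            current = sites.setdefault(site.group(1), {})
            continue
        field = FIELD_RE.fullmatch(line)
        if current is None or not field:
            # Skips blank lines and the trailing process line.
            continue
        # Spool/Queue blocks repeat keys such as Count; keep the first one.
        current.setdefault(field.group(1).strip(), field.group(2).strip())
    return sites


def seconds_since_update(site_data, now):
    """Age of the status file in seconds relative to the timestamp 'now'."""
    updated = int(site_data["Updated"].split()[0])
    return now - updated


if __name__ == "__main__":
    data = parse_mknotifyd_section(SAMPLE)["SITENAME"]
    # Compare against the cache time of the omd_status section above.
    print(data["Version"], seconds_since_update(data, now=1579006293))
```

A large positive or a negative age would point towards a stale status file or a clock difference between the systems involved, but that is only a guess until the real check logic is confirmed.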
Here is a screenshot of my htop where you can clearly see that the MKNOTIFYD is running.
This is the state of the MKNOTIFYD:
I did an omd site restart, but the problem was still showing up. I also rebooted the whole virtual appliance, but that did not help either. The problem is persistent.
The setup is Checkmk EE 1.5.0p23 running on the latest Checkmk Virtual Appliance Firmware (1.4.6), with the baked agent installed and 1 distributed site connected.
Maybe it’s something related to the appliance, or the server-side plugin is misinterpreting the agent output. idk ¯_(ツ)_/¯
I’m open to new ideas and help from you. Thanks in advance.
I censored some output for security reasons.