Hi Forum,
I currently have a strange problem with the state of the notification spooler. My Checkmk service reports that the notification spooler is not running, although it is in fact running and working. I tried to troubleshoot the problem using the logs, but found nothing unusual there.
I can see the process running in htop, and it is doing its job because I receive e-mails.
Here is a screenshot of the service problems, which do not reflect the actual state of the spooler:
The omd status output showed me that mknotifyd is running and OK:
OMD[SITENAME]:~$ omd status
mkeventd: running
liveproxyd: running
mknotifyd: running
rrdcached: running
cmc: running
apache: running
crontab: running
-----------------------
Overall state: running
As described on https://checkmk.com/cms_notifications.html, I took a look at …
var/log/notify.log
var/log/mknotifyd.log
… but found nothing strange there, even with the highest log level (debugging).
I also took a look at the agent output to find the problem, as you can see here:
<<<omd_status:cached(1579006293,60)>>>
[SITENAME]
mkeventd 0
liveproxyd 0
mknotifyd 0
rrdcached 0
cmc 0
apache 0
crontab 0
OVERALL 0
<<<mknotifyd:sep(0)>>>
[SITENAME]
Version: 1.5.0p23
Updated: 1579006331 (2020-01-14 13:52:11)
Started: 1579005428 (2020-01-14 13:37:08, 903 sec ago)
Configuration: 1579005428 (2020-01-14 13:37:08, 903 sec ago)
Listening FD: None
Spool: New
Count: 0
Oldest:
Youngest:
Spool: Deferred
Count: 0
Oldest:
Youngest:
Spool: Corrupted
Count: 0
Oldest:
Youngest:
Queue: mail
Waiting: 0
Processing: 0
(SITENAME,36508,11388,00:00:00/15:09,851) python /omd/sites/SITENAME/bin/mknotifyd
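In case it helps anyone compare the fields, here is a minimal Python sketch that pulls the top-level values out of the mknotifyd section above. It assumes the "Key: value" layout is exactly as shown in my output. The function names parse_mknotifyd and leading_epoch are made up for illustration. I do not know how the server-side check actually evaluates this section, so this is only a way to inspect the data, not a copy of the plugin logic.

```python
import re

# Sample copied from the agent output above (site name censored as in the post).
SAMPLE = """\
[SITENAME]
Version: 1.5.0p23
Updated: 1579006331 (2020-01-14 13:52:11)
Started: 1579005428 (2020-01-14 13:37:08, 903 sec ago)
Configuration: 1579005428 (2020-01-14 13:37:08, 903 sec ago)
Listening FD: None
Spool: New
Count: 0
Oldest:
Youngest:
Queue: mail
Waiting: 0
Processing: 0
"""


def parse_mknotifyd(text):
    """Return {site: {key: value}} for the top-level fields of each site.

    Lines that belong to a "Spool:" or "Queue:" block are skipped, because
    their keys (Count, Oldest, ...) repeat and would overwrite each other.
    """
    sites = {}
    current = None
    in_block = False
    for raw in text.splitlines():
        line = raw.strip()
        header = re.fullmatch(r"\[(.+)\]", line)
        if header:
            current = sites.setdefault(header.group(1), {})
            in_block = False
            continue
        if current is None or ":" not in line:
            continue
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key in ("Spool", "Queue"):
            in_block = True
        if in_block:
            continue
        current[key] = value
    return sites


def leading_epoch(value):
    """Extract the leading Unix timestamp from a value like '1579006331 (...)'."""
    match = re.match(r"(\d+)", value)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    site = parse_mknotifyd(SAMPLE)["SITENAME"]
    updated = leading_epoch(site["Updated"])
    started = leading_epoch(site["Started"])
    print("Version:", site["Version"])
    print("Listening FD:", site["Listening FD"])
    print("Seconds between start and last update:", updated - started)
```

With my output, the gap between "Started" and "Updated" comes out to the same 903 seconds that the agent reports, so at least the status file is being refreshed by the running process.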
Here is a screenshot of my htop where you can clearly see that the MKNOTIFYD is running.
This is the state of mknotifyd:
I did an omd site restart, but the problem was still showing up. I also rebooted the whole virtual appliance, but that did not help either. The problem is persistent.
Setup:
Checkmk EE 1.5.0p23 running on the latest Checkmk Virtual Appliance Firmware (1.4.6) with baked agent installed and 1 distributed site connected.
Maybe it’s something related to the appliance, or the server-side plugin is misunderstanding the agent output. idk ¯_(ツ)_/¯
I’m open to new ideas and help from you. Thanks in advance.
I censored some output for security reasons.