I recently upgraded to 2.0.0p35.cre on Ubuntu 20.04.
I’m not 100% sure whether it is related to the upgrade or to a change I made afterwards, but I no longer get notifications via email. The mail spooler still works.
When I look at a recent service warning, the following is shown:
|Currently in downtime|no|
|Current check attempt|1/1|
|The last time the service was OK|41 h|
|Service check latency|0.00 s|
|Service check duration|0.00 s|
|Service notification period|24X7|
|In notification period|yes|
|Service notification number|1|
|The time of the last service notification|-|
|Notification postponement reason||
So as far as I can see, a notification should have been sent. Any idea what is going wrong or where I can start looking?
@eric76
First, hi.
Second, did you check your notification rules? Perhaps a new condition disables the rules for your hosts?
And I’m a bit confused, because “The time of the last service notification” says “-”. Did you receive mails before the update, and this value just got erased?
I think we need to take a closer look at your host and your notification settings to troubleshoot this further.
Before the update everything was working fine, although I can’t rule out that I made some change that has this unintended effect.
Yesterday evening something interesting happened. A whole host went down and I received a notification! So the problem of notifications not being sent seems to affect only services, not hosts.
I had 3 notification rules, 2 for test servers and one “catch all”. I moved the catch-all rule, “Notify all contacts of a host/service via HTML email”, to the top. It has no conditions and notifies all contacts of the notified object.
I didn’t (knowingly) remove any values. As a test, I filled a filesystem that had never been full, so that it triggered a critical.
And it also shows the “-” value next to the “time of last service notification”.
Did you receive a notification for the critical filesystem?
And as you mentioned “all contacts of the notified object”: are you a member of all contact groups, or do you use “Everyone”?
No, I didn’t receive a notification for the critical filesystem (service). I only received an email alert yesterday for the “Host down” (host alert) - and a recovery alert when it got up again.
I am a member of all contact groups, but to be 100% sure and to exclude a problem in groups, I checked the tickbox “Notify all users” before triggering the filesystem alert.
As you describe it, you only get host-related notifications, but no service-related ones.
Could you test with “fake check results” whether this really happens?
Also, if you check the “Notification History” of your services, you should be able to see what “procedure” was planned, and in your “notify.log” what exactly happened or did not happen.
I think we need more debug information to solve this problem.
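For anyone who prefers to script the “fake check results” test, here is a minimal sketch. It assumes the test can be expressed as the Nagios-style external command `PROCESS_SERVICE_CHECK_RESULT` sent as a Livestatus `COMMAND` line; the function name `build_fake_check_result` is made up for illustration, and the snippet only builds the command text, it does not send anything to a site.

```python
import time


def build_fake_check_result(host, service, state, output, now=None):
    """Build a Livestatus COMMAND line for a passive service check result.

    Hypothetical helper: assumes the Nagios-style external command
    PROCESS_SERVICE_CHECK_RESULT;<host>;<service>;<state>;<output>.
    """
    # 0 = OK, 1 = WARN, 2 = CRIT, 3 = UNKNOWN
    if state not in (0, 1, 2, 3):
        raise ValueError("state must be 0, 1, 2 or 3")
    # ';' separates the fields, so it must not appear in host or service.
    if ";" in host or ";" in service:
        raise ValueError("host and service must not contain ';'")
    timestamp = int(time.time() if now is None else now)
    return (
        f"COMMAND [{timestamp}] PROCESS_SERVICE_CHECK_RESULT;"
        f"{host};{service};{state};{output}\n"
    )


if __name__ == "__main__":
    # Host and service names taken from this thread.
    print(build_fake_check_result("dbtest", "Filesystem /", 2, "fake CRIT"), end="")
```

If a command like this produces a state change but still no entry in notify.log, the problem sits before the notification rules are evaluated.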
I tried fake check results. But I didn’t get a notification either.
The notify.log is empty. In the notify.1 log I see yesterday’s notification:
2023-05-02 17:53:30,495 [20] [cmk.base.notify] Got raw notification (manage) context with 36 variables
When I click on the service that I triggered today (the one where I filled the filesystem) and when I go to the history (Monitor, Overview, all hosts, dbtest, Services of Host, Service, Service Notifications dbtest, Filesystem /), the log is empty.
When I click on another host with a service that triggered a notification in the past (before the upgrade), I see that a notification has been sent (in this case, on Friday, 2023-03-03).
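To compare what arrived in notify.log before and after an upgrade, the log lines can be parsed with a small script. The sketch below is based only on the single line quoted in this thread, so the assumed layout (timestamp, numeric level in brackets, logger name in brackets, message) may not cover every line in a real log; `parse_notify_line` is a made-up helper name.

```python
import re
from datetime import datetime

# Layout assumed from the one sample line in this thread:
# "<date> <time>,<ms> [<level>] [<logger>] <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"\[(?P<level>\d+)\] \[(?P<logger>[^\]]+)\] (?P<message>.*)$"
)


def parse_notify_line(line):
    """Return the parsed fields of one log line, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    return {
        "time": datetime.strptime(match["ts"], "%Y-%m-%d %H:%M:%S,%f"),
        "level": int(match["level"]),
        "logger": match["logger"],
        "message": match["message"],
    }


if __name__ == "__main__":
    sample = (
        "2023-05-02 17:53:30,495 [20] [cmk.base.notify] "
        "Got raw notification (manage) context with 36 variables"
    )
    print(parse_notify_line(sample))
```

Counting the lines whose message starts with “Got raw notification” per day gives a quick view of whether the core hands any service notifications to the notification module at all.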
It’s late for that question, but do you use distributed monitoring?
If yes, you need to check the logs of the remote sites for the hosts they monitor.
If you debug in the notification overview, you should be able to see the incoming events your Checkmk sites generate. If the rule “Notify all users” without conditions really applies, the notify.log can’t be empty.
The “mknotify.log” is empty as well?
I’m not sure, but perhaps the version mismatch creates problems?
You stated that you’re running version 2.0.0p35, but your distributed site is on version 2.0.0p5.
I didn’t read the werks, but perhaps there was a problem that was fixed in between?
Otherwise I would recommend increasing the debug level for notifications in the global settings for all Checkmk sites and looking further into the logs.
I finally found the problem. I searched for “notifications” in the settings and found a rule for notifications for services. This was the culprit. It is very strange, as this rule had already existed for a long time, and I also didn’t see it show up in the debugging output. So I still suspect that something changed with the upgrade.
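A rule that switches off notifications for services does not appear among the notification rules, which may be why it stayed hidden for so long. One way to spot it earlier is to ask Livestatus which services have notifications disabled. The sketch below only builds the query text; it assumes the standard `services` table columns `notifications_enabled`, `last_notification` and `current_notification_number`, and `build_disabled_notifications_query` is a made-up helper name.

```python
def build_disabled_notifications_query(host):
    """Build a Livestatus query listing services of a host with notifications off."""
    columns = [
        "host_name",
        "description",
        "notifications_enabled",
        "in_notification_period",
        "current_notification_number",
        "last_notification",
    ]
    return (
        "GET services\n"
        f"Columns: {' '.join(columns)}\n"
        f"Filter: host_name = {host}\n"
        "Filter: notifications_enabled = 0\n"
        "OutputFormat: json\n"
        "\n"  # a blank line ends a Livestatus request
    )


if __name__ == "__main__":
    # Host name taken from this thread.
    print(build_disabled_notifications_query("dbtest"), end="")
```

If the affected service shows up in the result, the cause is a per-service setting rather than the notification rules, and no amount of rule debugging will reveal it.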