Number of failed notifications is growing and I can't stop it

CMK version: 2.2.0p14
OS version: Appliance 6.1.7

Hello,

I tried everything to stop the “failed notifications”, but the number is still growing. At the moment the count is 80459. I can’t delete them in the web GUI; the button does not work.

I can’t send email from the system, so I deleted the email rule. We have a “ServiceNow plugin” rule, but it is not ready yet, so it is deactivated.

I stopped the notifications in Master control.
I clicked the “disable” button in the notification rule.
I set “Temporary disable notifications” for every user.
But it just carries on creating failed notifications.
How can I stop it?

Hi @Gre,

can you please check for a user notification rule that might be the cause?

Otherwise, it would be helpful to get an example of a failed notification (without sensitive information).

Thanks in advance!
Norm

If this is the case, no new notifications should be generated.
As a first step I would clean up old files from the folder “~/var/check_mk/notify”.
If there is nothing inside, the notification system will not try to send anything new.
The failed notification count shown in the GUI comes from the core log.
I think you can only get rid of it by searching for and deleting the core log lines that contain “Cannot send”.
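
To see what is sitting in that folder before removing anything, a dry run along these lines might help. This is only a sketch: the function name `list_old_files` and the 7-day default are my own assumptions, not anything Checkmk ships, and it only lists files, it does not delete them.

```shell
# Dry run: list regular files under a directory that are older than N days.
# Nothing is deleted; review the list before removing anything by hand.
# list_old_files and the 7-day default are hypothetical, not part of Checkmk.
list_old_files() {
    dir="$1"
    days="${2:-7}"
    # -mtime +N matches files last modified more than N full days ago
    find "$dir" -type f -mtime +"$days" -print
}
```

As the site user you could then call it as `list_old_files ~/var/check_mk/notify 7` and decide what to clean up from the output.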

Thank you,
I found a notification rule for a user causing the failed notifications.

Thanks for your help.
I deleted the files under var/check_mk/notify/deferred,
but the web GUI still says “80778 failed notifications”.
Where can I find the core log file you mentioned?

Can I ask you a question? Why do I not have the folder /var/check_mk/notify?

Is it possible that you ran your find on a machine that is not a CMK server?

~/var/check_mk/core/archive → all core archive files

Inside these files you can look for lines containing “Cannot send”.
If these lines are removed and the core is restarted, all the failed notifications are gone.
I don’t know if there is a better way to do this.
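
A rough sketch of how that search and removal could look, using only standard grep and cp. The function names are made up for illustration, and I am assuming the files are plain text. The first function is read-only. The second keeps a `.bak` copy of the file before rewriting it.

```shell
# Print "file:count" for every file below a directory that contains
# at least one line with the fixed string "Cannot send". Read-only.
# (Function names are hypothetical, not Checkmk commands.)
count_cannot_send() {
    # -r recurse, -c per-file match count, -F fixed string (no regex)
    grep -rcF "Cannot send" "$1" | grep -v ':0$'
}

# Remove the "Cannot send" lines from one file, keeping a backup copy.
strip_cannot_send() {
    file="$1"
    cp -p "$file" "$file.bak" || return 1
    grep -vF "Cannot send" "$file.bak" > "$file"
    # grep exits 1 when every line matched (empty result); that is fine here
    [ $? -le 1 ]
}
```

As the site user, `count_cannot_send ~/var/check_mk/core/archive` shows which files are affected, and `strip_cannot_send <file>` cleans one of them. It is probably safest to stop the site first (`omd stop`), edit, and start it again (`omd start`), so the core is not writing to the files while they change.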

Well, it’s possible I didn’t do it right, but it should be the Checkmk server.

Did you check the var directory in the OMD environment? It’s not in /var.

How do I do that? What does OMD mean?

You have to switch to the user that is named like your Checkmk site:
su - <sitename>
Then you will find var in that user’s home directory:
cd var
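
If it helps to see where that ends up on disk: the site user’s home directory is normally /omd/sites/<sitename>, so “~/var” as the site user is not the system /var. A small sketch, with a made-up helper name and “mysite” as a placeholder site name, that checks whether the notify folder exists for a given site:

```shell
# Print the notify folder of a site if it exists, otherwise report the
# path that was checked. site_notify_dir is a hypothetical helper and
# the second argument only exists so the site root can be overridden.
site_notify_dir() {
    site="$1"
    root="${2:-/omd/sites}"
    dir="$root/$site/var/check_mk/notify"
    if [ -d "$dir" ]; then
        echo "$dir"
    else
        echo "not found: $dir" >&2
        return 1
    fi
}
```

Running `omd sites` as root lists the site names on the server, and `site_notify_dir mysite` then tells you whether the folder is there.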
