Suppress multiple notifications for flapping status after first notification

I receive multiple notifications for a single status change, because the status is flapping every 1 to 2 hours. The first notification is absolutely enough for me, and it should be sent on the first status change, so I can’t use “Delay service notifications”.

Instead I need something like “delay after first notification”, which I would set to 24 hours to receive only one notification per day. How can I achieve this?

Hi @mgutt! You could use the following two options in a notification rule to achieve what you are looking for:

If my answer solved your question, do not forget to mark it as the solution, so others can pick it up quickly. :slight_smile:

@robin.gierse if I understand @mgutt correctly, they are not talking about periodic notifications, but about genuinely new issues.
For example:
Service goes to non-OK at 01:00 → first notification
Service becomes OK again at 01:10 → hard state OK → notification count reset
Service stays OK long enough that the flapping detection cannot pick it up
Service goes to non-OK again at e.g. 08:00 → new first notification, as it is detected as a separate issue.

I guess in this case, the notification numbers from your example wouldn’t work at all, right?

I’m not sure if what @mgutt is looking for can be achieved within Checkmk out of the box :frowning:
You would somehow need a notification layer that is stateful or has a memory of past notifications, so here are two rather dirty workarounds I can think of:
a) Hack the notification plugin to do a quick lookup (e.g. via Livestatus) to see if there has been a notification for the issue in the past 24 hours, OR
b) Instead of sending notifications directly by mail, use the notification method "Forward notification to event console". In the EC you can merge open events over an interval of e.g. 24 hours, and then create monitoring notifications from the EC events. You might have to limit the event lifetime to 24 hours as well, so that the event is archived and a new event is created after 24 hours.

I think b) is way more transparent to other admins/users, but it still feels weird. Maybe someone has a better idea?
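To make the idea behind workaround a) a bit more concrete, here is a minimal sketch of a stateful "one notification per 24 hours" check. Note that it swaps the Livestatus lookup for a small JSON state file, simply because that is easier to show in a few lines. The function name `should_notify`, the state file, and the key format are all assumptions made for illustration, not part of Checkmk.

```python
import json
from pathlib import Path

COOLDOWN_SECONDS = 24 * 60 * 60  # the 24-hour window from the question


def should_notify(state_file, key, now, cooldown=COOLDOWN_SECONDS):
    """Return True and record `now` if `key` was not notified within `cooldown`.

    `state_file` and `key` are hypothetical: the file maps a key such as
    "host/service" to the Unix timestamp of the last notification sent.
    """
    path = Path(state_file)
    try:
        last_sent = json.loads(path.read_text())
    except (FileNotFoundError, json.JSONDecodeError):
        last_sent = {}  # no usable state yet: treat everything as never notified

    previous = last_sent.get(key)
    if previous is not None and now - previous < cooldown:
        return False  # still inside the quiet period, suppress this one

    # Outside the quiet period: remember this notification and let it through.
    last_sent[key] = now
    path.write_text(json.dumps(last_sent))
    return True
```

In a custom notification script, the key could be built from the `NOTIFY_HOSTNAME` and `NOTIFY_SERVICEDESC` environment variables and `now` taken from `time.time()`. You would probably want to apply the check to problem notifications only, so that recoveries are handled separately, and add file locking if several notifications can run in parallel.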

You got me there @gstolz!
You are completely right, I misunderstood the issue at hand.
Your explanation is on point, and I like the Event Console approach, although it still adds some complexity.
But what @mgutt asks for is just daily monitoring and has nothing to do with flapping states. He is looking for an overarching event management solution, I guess.

It’s kinda like the approach we use for generating tickets on monitoring events. We store open ticket numbers and do not generate another ticket for a service (unless it changes from WARN to CRIT, which results in a criticality change) until the ticket is closed.
But this won’t work without a custom notification script which keeps track of open events.

How did you solve this? That would be even better than a 24h lock for new notifications.

We have developed our own notification script which handles the API calls to the ticket system and stores the ticket number as a comment on the service/host. If a new event comes up, we check for an existing comment in the expected format and do nothing (except check the criticality).
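The comment check described above might look roughly like the following. This is only a sketch under assumptions: the comment format `TICKET#<number> state=<STATE>`, the function name `decide_action`, and the severity ordering are all made up for illustration, and the actual API calls to the ticket system and to the monitoring core are left out.

```python
import re

# Hypothetical marker format for the stored comment, e.g. "TICKET#4711 state=WARN".
TICKET_PATTERN = re.compile(
    r"TICKET#(?P<number>\d+) state=(?P<state>OK|WARN|CRIT|UNKNOWN)"
)

# Assumed ordering of criticality, lowest to highest.
SEVERITY = {"OK": 0, "WARN": 1, "UNKNOWN": 2, "CRIT": 3}


def decide_action(comments, new_state):
    """Decide what to do for a new event on a host/service.

    Returns ("create", None) if no ticket comment exists,
    ("escalate", ticket) if the new state is more severe than the recorded one,
    and ("skip", ticket) otherwise.
    """
    for comment in comments:
        match = TICKET_PATTERN.search(comment)
        if match is None:
            continue  # ordinary comment, not one of our ticket markers
        ticket = match.group("number")
        if SEVERITY[new_state] > SEVERITY[match.group("state")]:
            return ("escalate", ticket)
        return ("skip", ticket)
    return ("create", None)
```

The list of comments would come from the monitoring core (for example via a Livestatus query), and the marker comment would have to be removed once the ticket is closed, so that the next problem opens a fresh ticket.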
