Periodic service discovery - removing vanished services

5rG · December 13, 2019, 8:43am

Hi,

If I’m using a Periodic service discovery rule with the option to not just add but also remove vanished services, then I think the services should be removed only if the Check_MK service is OK.

We had cases where a monitored server was unreachable or the agent did not work, and for that reason all services were removed from monitoring. Of course, when the agent was up again, the services were added back, but in the meantime (e.g., if the Check_MK Inventory runs every 12 hours) we had no service monitoring.

It would also be nice to be able to select which service states qualify for removal (warn, crit, unknown) and how much time should pass before the services are removed.

I’m monitoring a large network infrastructure and I want to remove all unknown interface services (unconfigured on the network devices) and all critical interface services (down for more than one week).
I’m currently solving this problem with a daily script that parses the autochecks.mk file and removes all unknown and critical services that are more than X days old.

Regards,
Peter
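
The selection logic Peter describes could be sketched roughly as below. This is only an illustration, not his actual script: it does not parse autochecks.mk, and the `Service` record, the `services_to_remove` function and the `max_age` parameter are hypothetical names. It assumes each service's state and last state change time have already been fetched from the monitoring core, and it skips removal entirely when the Check_MK service of the host is not OK.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Usual monitoring state codes
OK, WARN, CRIT, UNKNOWN = 0, 1, 2, 3


@dataclass
class Service:
    # Hypothetical record; the fields are assumed to be filled from the monitoring core
    description: str
    state: int
    last_state_change: datetime


def services_to_remove(services, check_mk_state, now,
                       max_age=timedelta(days=7), states=(CRIT, UNKNOWN)):
    """Return the services that are candidates for removal."""
    # If the agent/host check itself is not OK, the states are not trustworthy,
    # so nothing is removed.
    if check_mk_state != OK:
        return []
    # Keep only services in one of the selected states for at least max_age.
    return [
        s for s in services
        if s.state in states and now - s.last_state_change >= max_age
    ]
```

The returned list would then still have to be applied to the configuration, for example by editing autochecks.mk as in the original script; that part is left out here on purpose.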

benjamin.alfery · March 2, 2021, 3:39pm

Hi,

I would be interested in how you solved this in a bit more detail. I’m currently looking for a solution to automatically remove services that have been vanished for, let’s say, a week or so.

BR
Benjamin

foobar · October 6, 2021, 3:06pm

We’ve been looking for this for 3+ years and even discussed some more trivial cases to do it manually, but it seems to have no priority.

linux_frickler · October 11, 2021, 12:02pm

I think it got implemented in Werk 11001 in Checkmk 2.0:

(after we begged for years)
Not upgraded to 2.0 yet. Can anyone confirm that it works?

foobar · October 12, 2021, 8:56am

Yes, the Werk is implemented. Apparently without a time setting for how long a service needs to be unknown/vanished before it is finally removed.

For instance, if you remove all checks in unknown state (and configure it to run every 24h), you end up removing tons of services that are more or less “flapping”, as well as ones that are actually being worked on and maybe even acknowledged ones (I haven’t seen how to prevent acknowledged unknowns from being removed).

Cheers

LaSoe · October 25, 2021, 5:14pm

As far as I can tell, the white-/blacklist works fine (at least for us). Only the services that match the pattern are added / removed.

It should be noted here that only one Periodic service discovery rule per device is executed. This means that if two or more rules overlap, only one of the matching rules will be executed (first match).

If, for example, you have a general rule for all devices that adds services and a specific rule for removing vanished services on a specific group of devices, the more general rule will not be executed on this specific group of devices. So in the end you never really know which rules will be executed on which devices.
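
To illustrate the first-match behaviour described above, here is a minimal sketch. It is not Checkmk code: the `Rule` class, the `effective_rule` function and the host name patterns are made up for the example, and the rule order in the list stands in for the order in which the rules are evaluated.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Rule:
    # Hypothetical stand-in for a Periodic service discovery rule
    name: str
    matches: Callable[[str], bool]  # decides whether the rule applies to a host
    remove_vanished: bool


def effective_rule(host: str, rules: List[Rule]) -> Optional[Rule]:
    """Return the first rule that matches the host; later matches are ignored."""
    for rule in rules:
        if rule.matches(host):
            return rule
    return None


rules = [
    # Specific rule: only for hosts whose name starts with "sw-" (made-up convention)
    Rule("remove vanished on switches", lambda h: h.startswith("sw-"), True),
    # General rule: matches every host
    Rule("add services everywhere", lambda h: True, False),
]

# "sw-01" matches both rules, but only the first one is returned.
rule_for_switch = effective_rule("sw-01", rules)
# "web-01" matches only the general rule.
rule_for_server = effective_rule("web-01", rules)
```

With this kind of evaluation, whatever the general rule configures never reaches the switch group unless the specific rule repeats those settings itself.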