Event Console rule pack rule "Limit event lifetime" not (exactly) expiring events?

Apologies for another newbie question.

I have an Event Console rule (looking for the text “Information (.*)”) which previously had 11 hits (all from yesterday). I adjusted the “Limit event lifetime” of the rule to expire events of that type after eight hours:

image

And the hits disappeared from the rule pack menu:

They did not, however, subtract from this number:

image

And they did not disappear from here:

Does something else handle that?

Thank you.

CMK version: 2.2.0(p24) raw
OS version: Ubuntu 22.04

Eventually I found my way to:

Setup → Event Console → Settings - Limit amount of current events

I set a Host limit of “100” current events with “Delete oldest event, create new event” and that seems to keep me cropped to 101 events. (I am still not sure what the rules I set actually do.)

I have two questions, if anyone knows off the top of their head:

First, does that legitimately delete these or are they hidden or retained somewhere and still taking up space?

Second, when this particular setting hits 100, the “OMD checkmktest Event Console” service goes into a WARN state. Since cropping these off is exactly what I want it to do, I consider that state to be “OK”. Is it possible for me to change checkmk so that “Current events: 101… Event limit active for 1 hosts” counts as OK and not WARN?

Thanks.

Ron

Only two short remarks from my side.

  • the modified rule only applies the expire value to events created after the rule change
  • regarding the warning for the limit rule: it means you missed potential events because the system was forced to delete some; that is what this rule is intended for, not for “expiring” old events

Inside the event view you cannot see the expire time of an event.
Inside the event status file (~/var/mkeventd/status) you can see a “live_until” value for events with expire time.

with expire

{
    "facility": 1,
    "priority": 5,
    "text": "Still nothing happened.",
    "host": "myhost089",
    "ipaddress": "1.2.3.4",
    "application": "Foobar-Daemon",
    "pid": 0,
    "time": 1713291498.808372,
    "core_host": None,
    "host_in_downtime": False,
    "rule_id": "ER001",
    "contact_groups": None,
    "contact_groups_notify": False,
    "contact_groups_precedence": "host",
    "match_groups": (),
    "match_groups_syslog_application": (),
    "state": 0,
    "sl": 0,
    "first": 1713291498.808372,
    "last": 1713291498.808372,
    "phase": "open",
    "id": 4,
    "live_until": 1713295099.2209835,
    "live_until_phases": ["open", "ack"],
}

without expire

{
    "facility": 1,
    "priority": 5,
    "text": "Still nothing happened.111111",
    "host": "myhost1",
    "ipaddress": "1.2.3.5",
    "application": "Foobar-Daemon1234",
    "pid": 0,
    "time": 1713291866.260612,
    "core_host": None,
    "host_in_downtime": False,
    "rule_id": "ER001",
    "contact_groups": None,
    "contact_groups_notify": False,
    "contact_groups_precedence": "host",
    "match_groups": (),
    "match_groups_syslog_application": (),
    "state": 0,
    "sl": 0,
    "first": 1713291866.260612,
    "last": 1713291866.260612,
    "phase": "open",
    "id": 5,
}
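
For reference, “live_until” is a Unix epoch timestamp just like “time”, so the remaining lifetime can be worked out by hand. Here is a minimal Python sketch using the two entries above, trimmed to the relevant fields. The helper name describe_expiry is made up for illustration and is not part of Checkmk.

```python
from datetime import datetime, timezone


def describe_expiry(event, now):
    """Describe when an event entry expires, relative to the epoch time `now`."""
    live_until = event.get("live_until")
    if live_until is None:
        # Entries without "live_until" have no lifetime limit attached.
        return "no expiry set"
    when = datetime.fromtimestamp(live_until, tz=timezone.utc)
    remaining = live_until - now
    return f"expires {when:%Y-%m-%d %H:%M:%S} UTC ({remaining:.0f} s from 'now')"


# The two example entries from above, reduced to the fields used here.
with_expire = {"id": 4, "time": 1713291498.808372, "live_until": 1713295099.2209835}
without_expire = {"id": 5, "time": 1713291866.260612}

for ev in (with_expire, without_expire):
    # Use the creation time as "now" to see the lifetime the rule assigned.
    print(ev["id"], describe_expiry(ev, now=ev["time"]))
```

Measured from its creation time, event 4 has about 3600 seconds to live, which suggests the rule in this example used a lifetime of one hour.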

Hi.

Ah, I see now, thank you. Rules are (and surely were all along) working correctly.

When these logs are “expired” are they genuinely deleted? (I ask because there are warnings in the checkmk documentation and videos about Event Console specifically not being a good place to have a huge archive of logs).

Also, if I am looking at an event log I have the option to “Archive” the event. Is that the same as “delete” or does that linger somewhere also? (There seems not to be another manual delete option, in spite of the warnings.)

For the record, as an experiment, I changed my rules to set my “Limit event lifetime” to 20 minutes and my Event Console Settings “Host limit” to 1 event, then rebooted the client to generate some information events (which cut me down to 2 events, both with epoch times 20 minutes away). (I then reset my host limit to 1000 logs, lest some other event be generated in the meantime and spoil the experiment).

The 20 minute mark came and a minute after that the new events disappeared. I must have been overwhelmed by the number of events I had in there before and could not see that it was working. Limiting these should keep me from ever worrying about the WARN/CRIT or auto-deleting over X number of messages etc.

Incidentally, how would I get that nice output from “status”? (cat, vi, and nano all give me a gigantic block of text)?

Ron

PS - For anyone looking at this in the future (including me), “~/var/” in this case means “/omd/sites/[your-checkmksite]/var/” (etc).

That’s correct, the number of open events should not be too high.

That is the same. An expired event goes to the same location (archive file) as an archived event.

The huge block of text is a Python dictionary. I only formatted it with my formatter of choice (black).
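
If black is not at hand, something like the following should give a similar result with only the Python standard library. The content is a Python literal rather than JSON (note the None, False and tuples), so ast.literal_eval is used instead of a JSON parser. This is a rough sketch: it assumes the whole status file parses as a single Python literal, and the function name pretty_status is just made up here.

```python
import ast
import pprint


def pretty_status(text):
    """Parse a Python-literal string and return a wrapped, readable rendering."""
    data = ast.literal_eval(text)  # only evaluates literals, never runs code
    return pprint.pformat(data, indent=4, width=60, sort_dicts=False)


# Small sample in the same style as the status file content shown earlier.
sample = (
    '{"host": "myhost1", "core_host": None, "host_in_downtime": False, '
    '"match_groups": (), "id": 5}'
)
print(pretty_status(sample))

# For the real file (run as the site user), something along these lines:
#   from pathlib import Path
#   print(pretty_status((Path.home() / "var/mkeventd/status").read_text()))
```

The output is still a valid Python literal, only broken across lines, so it can be pasted back into a Python shell if needed.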

Thank you so much, that sorted me out.

For anyone finding this in the future (including myself) the “archive file” referenced here is the logs here:

/omd/sites/your-site/var/mkeventd/history/

Ron