Syslog Kernel messages generate (too) many lines in Event Console

Hi,
I am using CRE 1.6.0p9 with the Event Console.

All my Linux boxes send their syslog messages to my check_mk server(s).
I have two nameservers that regularly flood my Event Console with kern.warning messages.

For a better overview I want each of these occurrences to end up in a single event.
Either reconfiguring my Linux boxes or creating a new rule in my rule packs would be fine.

How can I achieve this? (Aggregating the syslog lines of the same host, facility and severity within the same second(s) into one check_mk event)

Event Console --> your rule --> COUNTING & TIMING
With “Time period” you can define the interval for the counting, and the three checkboxes should be selected so that separate events are created for different applications and hosts.

No, that’s not what I meant.

For example I have these kernel messages:

Apr 17 12:05:39 my-crashy-trashy-host kernel: kworker/u2:2: page allocation failure: order:0, mode:0x2284020(GFP_ATOMIC|__GFP_COMP|__GFP_NOTRACK)
Apr 17 12:05:39 my-crashy-trashy-host kernel: CPU: 0 PID: 28720 Comm: kworker/u2:2 Not tainted 4.9.0-12-amd64 #1 Debian 4.9.210-1
Apr 17 12:05:39 my-crashy-trashy-host kernel: Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
Apr 17 12:05:39 my-crashy-trashy-host kernel: Workqueue: writeback wb_workfn (flush-253:0)
Apr 17 12:05:39 my-crashy-trashy-host kernel:  0000000000000000 ffffffffb0f36bfe ffffffffb1602988 ffffb826c31cf6d0
Apr 17 12:05:39 my-crashy-trashy-host kernel:  ffffffffb0d8dfba 02284020cffd7490 ffffffffb1602988 ffffb826c31cf670
Apr 17 12:05:39 my-crashy-trashy-host kernel:  ffff982200000010 ffffb826c31cf6e0 ffffb826c31cf690 004bacff1da141bf
Apr 17 12:05:39 my-crashy-trashy-host kernel: Call Trace:
Apr 17 12:05:39 my-crashy-trashy-host kernel:  [<ffffffffb0f36bfe>] ? dump_stack+0x66/0x88
Apr 17 12:05:39 my-crashy-trashy-host kernel:  [<ffffffffb0d8dfba>] ? warn_alloc+0x13a/0x160
Apr 17 12:05:39 my-crashy-trashy-host kernel:  [<ffffffffb0d8e29b>] ? __alloc_pages_slowpath+0x24b/0xb30
Apr 17 12:05:39 my-crashy-trashy-host kernel:  [<ffffffffb0d8ed81>] ? __alloc_pages_nodemask+0x201/0x260
Apr 17 12:05:39 my-crashy-trashy-host kernel:  [<ffffffffb0de8f4a>] ? cache_grow_begin+0x9a/0x560
Apr 17 12:05:39 my-crashy-trashy-host kernel:  [<ffffffffb0de8f4a>] ? cache_grow_begin+0x9a/0x560
Apr 17 12:05:39 my-crashy-trashy-host kernel:  [<ffffffffb0de96c1>] ? fallback_alloc+0x161/0x200
Apr 17 12:05:39 my-crashy-trashy-host kernel:  [<ffffffffc0262da9>] ? alloc_indirect.isra.14+0x19/0x50 [virtio_ring]
Apr 17 12:05:39 my-crashy-trashy-host kernel:  [<ffffffffb0dea578>] ? __kmalloc+0x1e8/0x580
Apr 17 12:05:39 my-crashy-trashy-host kernel:  [<ffffffffc0262da9>] ? alloc_indirect.isra.14+0x19/0x50 [virtio_ring]
Apr 17 12:05:39 my-crashy-trashy-host kernel:  [<ffffffffc0262f86>] ? virtqueue_add_sgs+0x1a6/0x4b0 [virtio_ring]
Apr 17 12:05:39 my-crashy-trashy-host kernel:  [<ffffffffc01eb6c7>] ? __virtblk_add_req+0xb7/0x270 [virtio_blk]
Apr 17 12:05:39 my-crashy-trashy-host kernel:  [<ffffffffb0f1090c>] ? blk_rq_map_sg+0x22c/0x570
Apr 17 12:05:39 my-crashy-trashy-host kernel:  [<ffffffffc01eb9a6>] ? virtio_queue_rq+0x126/0x270 [virtio_blk]
Apr 17 12:05:39 my-crashy-trashy-host kernel:  [<ffffffffb0f14ca8>] ? __blk_mq_run_hw_queue+0x288/0x3e0
Apr 17 12:05:39 my-crashy-trashy-host kernel:  [<ffffffffb0f14a0e>] ? blk_mq_run_hw_queue+0x6e/0x80
Apr 17 12:05:39 my-crashy-trashy-host kernel:  [<ffffffffb0f16889>] ? blk_mq_flush_plug_list+0x139/0x160
Apr 17 12:05:39 my-crashy-trashy-host kernel:  [<ffffffffb0f0b923>] ? blk_flush_plug_list+0xc3/0x230
Apr 17 12:05:39 my-crashy-trashy-host kernel:  [<ffffffffb0d93a9d>] ? wb_update_bandwidth+0x4d/0x70
Apr 17 12:05:39 my-crashy-trashy-host kernel:  [<ffffffffb0f0be87>] ? blk_finish_plug+0x27/0x40
Apr 17 12:05:39 my-crashy-trashy-host kernel:  [<ffffffffb0e3e077>] ? wb_writeback+0x197/0x310
Apr 17 12:05:39 my-crashy-trashy-host kernel:  [<ffffffffb121e2f1>] ? __switch_to_asm+0x41/0x70
Apr 17 12:05:39 my-crashy-trashy-host kernel:  [<ffffffffb0e3e8b8>] ? wb_workfn+0xa8/0x380
Apr 17 12:05:39 my-crashy-trashy-host kernel:  [<ffffffffb121e2f1>] ? __switch_to_asm+0x41/0x70
Apr 17 12:05:39 my-crashy-trashy-host kernel:  [<ffffffffb121e2e5>] ? __switch_to_asm+0x35/0x70
Apr 17 12:05:39 my-crashy-trashy-host kernel:  [<ffffffffb0c9582a>] ? process_one_work+0x18a/0x430
Apr 17 12:05:39 my-crashy-trashy-host kernel:  [<ffffffffb0c95b1d>] ? worker_thread+0x4d/0x490
Apr 17 12:05:39 my-crashy-trashy-host kernel:  [<ffffffffb0c95ad0>] ? process_one_work+0x430/0x430
Apr 17 12:05:39 my-crashy-trashy-host kernel:  [<ffffffffb0c9bdc9>] ? kthread+0xd9/0xf0
Apr 17 12:05:39 my-crashy-trashy-host kernel:  [<ffffffffb121e2f1>] ? __switch_to_asm+0x41/0x70
Apr 17 12:05:39 my-crashy-trashy-host kernel:  [<ffffffffb0c9bcf0>] ? kthread_park+0x60/0x60
Apr 17 12:05:39 my-crashy-trashy-host kernel:  [<ffffffffb121e377>] ? ret_from_fork+0x57/0x70
Apr 17 12:05:39 my-crashy-trashy-host kernel: Mem-Info:
Apr 17 12:05:39 my-crashy-trashy-host kernel: active_anon:12351 inactive_anon:13287 isolated_anon:0
                                                               active_file:10612 inactive_file:14062 isolated_file:32
                                                               unevictable:3 dirty:1 writeback:3388 unstable:0
                                                               slab_reclaimable:3223 slab_unreclaimable:2753
                                                               mapped:3288 shmem:420 pagetables:592 bounce:0
                                                               free:1013 free_pcp:13 free_cma:0

and I want them to be in one event only, so that I can read them properly. My notifications are configured so that every new event in the Event Console also triggers an alert message in Telegram, and it is very hard to tell what the problem was when you have to scroll through that many lines.

One of the things I love about CheckMk is that there can be many ways to do things.

Instead of firing all messages at the monitoring server as events, perhaps you could do something agent-based (is that possible for you?), possibly even something custom that you write yourself to achieve what you want.

Sure, it may mean writing a script or a combination of things, or maybe switching to logwatch, etc.
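
To make the "write a script" idea a bit more concrete, here is a minimal sketch of the aggregation step only. It is not part of check_mk or the Event Console. It assumes classic BSD-style syslog lines like the ones quoted above, and the names `LINE_RE`, `parse` and `aggregate` are made up for this example. Facility and severity are not visible in that text format, so the sketch groups by timestamp, host and program tag instead. How the merged text is then forwarded to the monitoring server is left open.

```python
import re
from itertools import groupby

# Classic BSD syslog line: "Apr 17 12:05:39 host tag: message"
LINE_RE = re.compile(
    r"^(?P<ts>[A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) (?P<tag>[^:\s]+): ?(?P<msg>.*)$"
)


def parse(lines):
    """Yield (key, message) pairs, where key is (timestamp, host, tag)."""
    key = None
    for raw in lines:
        line = raw.rstrip("\n")
        match = LINE_RE.match(line)
        if match:
            key = (match["ts"], match["host"], match["tag"])
            yield key, match["msg"]
        elif key is not None and line.strip():
            # Continuation line without its own header (e.g. the Mem-Info
            # block): keep it with the key of the preceding line.
            yield key, line.strip()


def aggregate(lines):
    """Merge consecutive lines with the same key into one multi-line event."""
    events = []
    for key, group in groupby(parse(lines), key=lambda pair: pair[0]):
        ts, host, tag = key
        events.append({
            "timestamp": ts,
            "host": host,
            "tag": tag,
            "text": "\n".join(msg for _, msg in group),
        })
    return events
```

Fed with the kernel trace above, `aggregate()` would return a single entry for `my-crashy-trashy-host` at `12:05:39`, with the whole trace in its `text` field. Grouping on the exact timestamp is a simplification: a trace that crosses a second boundary would be split into two events, so a real script would probably want a small time window instead.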

If you do need to keep sending all messages, maybe you can filter the events somewhat, at least to keep the noise going to Telegram down.

Hm… I think before I put that much more effort into it (I really wrote some sick regexes for many applications), I would rather switch over to logstash/graylog and stop using the Event Console.

It’s up to you, of course. But if I were you, I’d learn a bit more about how you might do things with CheckMk before adding more monitoring islands. YMMV.
