Nagios log rotation bug or CheckMK availability report bug?

CMK version: 2.4.0p24.cre / 2.4.0p22.cre
OS version: Debian 13

Hi,
our operators create availability reports for the previous month at the start of each new month. This time there was a problem: the data was N/A for one whole day, which is over 3% of the period (1 of 31 days). The problematic day is 2026-03-16. After analyzing this for a while, I finally found the cause. All the logs in ~/var/nagios/archive have a modification time of 23:59; only the log file of the preceding day was written one second later, at 00:00:

OMD[nms@nms2]:~/var/nagios/archive$ ll nagios-03-*2026-00.log*
-rw-r--r-- 1 nms nms  4355906 Feb 28 23:59 nagios-03-01-2026-00.log
-rw-r--r-- 1 nms nms  4281898 Mar  1 23:59 nagios-03-02-2026-00.log
-rw-r--r-- 1 nms nms  4251195 Mar  2 23:59 nagios-03-03-2026-00.log
-rw-r--r-- 1 nms nms  4271704 Mar  3 23:59 nagios-03-04-2026-00.log
-rw-r--r-- 1 nms nms  4526051 Mar  4 23:59 nagios-03-05-2026-00.log
-rw-r--r-- 1 nms nms  4307700 Mar  5 23:59 nagios-03-06-2026-00.log
-rw-r--r-- 1 nms nms  4237518 Mar  6 23:59 nagios-03-07-2026-00.log
-rw-r--r-- 1 nms nms  4272722 Mar  7 23:59 nagios-03-08-2026-00.log
-rw-r--r-- 1 nms nms  4184375 Mar  8 23:59 nagios-03-09-2026-00.log
-rw-r--r-- 1 nms nms  4266795 Mar  9 23:59 nagios-03-10-2026-00.log
-rw-r--r-- 1 nms nms  8998907 Mar 10 23:59 nagios-03-11-2026-00.log
-rw-r--r-- 1 nms nms  4381790 Mar 11 23:59 nagios-03-12-2026-00.log
-rw-r--r-- 1 nms nms  6952642 Mar 12 23:59 nagios-03-13-2026-00.log
-rw-r--r-- 1 nms nms  4525390 Mar 13 23:59 nagios-03-14-2026-00.log
-rw-r--r-- 1 nms nms  4233513 Mar 14 23:59 nagios-03-15-2026-00.log
-rw-r--r-- 1 nms nms  4246126 Mar 16 00:00 nagios-03-16-2026-00.log
-rw-r--r-- 1 nms nms  4355023 Apr  2 12:59 nagios-03-17-2026-00.log
-rw-r--r-- 1 nms nms  4355023 Mar 16 23:59 nagios-03-17-2026-00.log.orig
-rw-r--r-- 1 nms nms  6771736 Mar 17 23:59 nagios-03-18-2026-00.log
-rw-r--r-- 1 nms nms  4322754 Mar 18 23:59 nagios-03-19-2026-00.log
-rw-r--r-- 1 nms nms  6706325 Mar 19 23:59 nagios-03-20-2026-00.log
-rw-r--r-- 1 nms nms 13974524 Mar 20 23:59 nagios-03-21-2026-00.log
-rw-r--r-- 1 nms nms  4453645 Mar 21 23:59 nagios-03-22-2026-00.log
-rw-r--r-- 1 nms nms  4491218 Mar 22 23:59 nagios-03-23-2026-00.log
-rw-r--r-- 1 nms nms  4490430 Mar 23 23:59 nagios-03-24-2026-00.log
-rw-r--r-- 1 nms nms  6671432 Mar 24 23:59 nagios-03-25-2026-00.log
-rw-r--r-- 1 nms nms  8977179 Mar 25 23:59 nagios-03-26-2026-00.log
-rw-r--r-- 1 nms nms 18229690 Mar 26 23:59 nagios-03-27-2026-00.log
-rw-r--r-- 1 nms nms  4654324 Mar 27 23:59 nagios-03-28-2026-00.log
-rw-r--r-- 1 nms nms  4374749 Mar 28 23:59 nagios-03-29-2026-00.log
-rw-r--r-- 1 nms nms  4408760 Mar 29 23:59 nagios-03-30-2026-00.log
-rw-r--r-- 1 nms nms 13603804 Mar 30 23:59 nagios-03-31-2026-00.log
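
The outlier in the listing above can also be found without reading the listing by eye. This is only a minimal Python sketch, not anything shipped with Checkmk or Nagios: the function name `rotation_time_outliers` is hypothetical, and it assumes that the file modification time reflects the moment of rotation. It reports every file whose local HH:MM differs from the most common one.

```python
import glob
import os
import time
from collections import Counter


def rotation_time_outliers(paths):
    """Return [(path, "HH:MM")] for files whose local modification time
    differs from the most common HH:MM among the given files."""
    stamps = {}
    for path in paths:
        mtime = os.stat(path).st_mtime
        stamps[path] = time.strftime("%H:%M", time.localtime(mtime))
    if not stamps:
        return []
    # The most frequent HH:MM is treated as the "usual" rotation time.
    usual, _count = Counter(stamps.values()).most_common(1)[0]
    return [(path, hhmm) for path, hhmm in stamps.items() if hhmm != usual]


if __name__ == "__main__":
    # Path pattern taken from the listing above; run as the site user.
    pattern = os.path.expanduser("~/var/nagios/archive/nagios-03-*2026-00.log")
    for path, hhmm in rotation_time_outliers(sorted(glob.glob(pattern))):
        print(hhmm, path)
```

A file that was edited by hand afterwards (here nagios-03-17-2026-00.log, modified on Apr 2 at 12:59) would be reported as well, so the result needs a manual look.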

There were lines with non-monotonic timestamps at the start of the file nagios-03-17-2026-00.log. I fixed the timestamps of several lines at the start, and the availability report now works as expected. Here are the details of the correction:

OMD[nms@nms2]:~/var/nagios/archive$ diff -u nagios-03-17-2026-00.log.orig nagios-03-17-2026-00.log | sed -r 's/ALERT: [^;]+/ALERT: <censored>/;'
--- nagios-03-17-2026-00.log.orig       2026-03-16 23:59:39.762350836 +0100
+++ nagios-03-17-2026-00.log    2026-04-02 12:59:42.374705460 +0200
@@ -1,19 +1,19 @@
 [1773615600] LOG ROTATION: DAILY
-[1773615601] HOST DOWNTIME ALERT: <censored>;STARTED; Host has entered a period of scheduled downtime
-[1773615601] HOST DOWNTIME ALERT: <censored>;STARTED; Host has entered a period of scheduled downtime
-[1773615601] HOST DOWNTIME ALERT: <censored>;STARTED; Host has entered a period of scheduled downtime
-[1773615601] HOST DOWNTIME ALERT: <censored>;STARTED; Host has entered a period of scheduled downtime
-[1773615601] SERVICE DOWNTIME ALERT: <censored>;Voltage PCH Vin;STARTED; Service has entered a period of scheduled downtime
-[1773615601] SERVICE DOWNTIME ALERT: <censored>;Service Snare;STARTED; Service has entered a period of scheduled downtime
-[1773615601] SERVICE DOWNTIME ALERT: <censored>;Sensor A Humidity;STARTED; Service has entered a period of scheduled downtime
-[1773615601] SERVICE DOWNTIME ALERT: <censored>;Interface Adaptive Security Appliance Internal-Data0/0 interface;STARTED; Service has entered a period of scheduled downtime
-[1773615601] TIMEPERIOD TRANSITION: 12x5;0;0
-[1773615601] TIMEPERIOD TRANSITION: 15x5;0;0
-[1773615601] TIMEPERIOD TRANSITION: 15x7;0;0
-[1773615601] TIMEPERIOD TRANSITION: 24X7;1;1
-[1773615601] TIMEPERIOD TRANSITION: 24x7;1;1
-[1773615601] TIMEPERIOD TRANSITION: 8x5;0;0
-[1773615601] logging initial states
+[1773615600] HOST DOWNTIME ALERT: <censored>;STARTED; Host has entered a period of scheduled downtime
+[1773615600] HOST DOWNTIME ALERT: <censored>;STARTED; Host has entered a period of scheduled downtime
+[1773615600] HOST DOWNTIME ALERT: <censored>;STARTED; Host has entered a period of scheduled downtime
+[1773615600] HOST DOWNTIME ALERT: <censored>;STARTED; Host has entered a period of scheduled downtime
+[1773615600] SERVICE DOWNTIME ALERT: <censored>;Voltage PCH Vin;STARTED; Service has entered a period of scheduled downtime
+[1773615600] SERVICE DOWNTIME ALERT: <censored>;Service Snare;STARTED; Service has entered a period of scheduled downtime
+[1773615600] SERVICE DOWNTIME ALERT: <censored>;Sensor A Humidity;STARTED; Service has entered a period of scheduled downtime
+[1773615600] SERVICE DOWNTIME ALERT: <censored>;Interface Adaptive Security Appliance Internal-Data0/0 interface;STARTED; Service has entered a period of scheduled downtime
+[1773615600] TIMEPERIOD TRANSITION: 12x5;0;0
+[1773615600] TIMEPERIOD TRANSITION: 15x5;0;0
+[1773615600] TIMEPERIOD TRANSITION: 15x7;0;0
+[1773615600] TIMEPERIOD TRANSITION: 24X7;1;1
+[1773615600] TIMEPERIOD TRANSITION: 24x7;1;1
+[1773615600] TIMEPERIOD TRANSITION: 8x5;0;0
+[1773615600] logging initial states
 [1773615600] LOG VERSION: 2.0
 [1773615600] CURRENT HOST STATE: 1102controller.i.cz;UP;HARD;1;OK - 10.0.164.2 rta 0.341ms lost 0%
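
The check and the manual correction shown in the diff can be sketched in Python. This is a hedged illustration only, not part of Checkmk or Nagios: the names `find_regressions` and `clamp_to_later_minimum` are hypothetical, and the code assumes that every relevant line starts with `[<epoch seconds>] ` as in the excerpt above. The first function reports lines whose timestamp is lower than one seen earlier; the second mirrors the manual fix by lowering each timestamp to the smallest timestamp that follows it.

```python
import re
from datetime import datetime, timezone

# Assumption: log lines look like "[1773615600] LOG ROTATION: DAILY".
TS_RE = re.compile(r"^\[(\d+)\] ")


def find_regressions(lines):
    """Return (line_number, timestamp, previous_max) for each line whose
    timestamp is lower than the highest timestamp seen before it."""
    regressions = []
    highest = None
    for number, line in enumerate(lines, start=1):
        match = TS_RE.match(line)
        if not match:
            continue  # skip lines without a leading timestamp
        ts = int(match.group(1))
        if highest is not None and ts < highest:
            regressions.append((number, ts, highest))
        else:
            highest = ts
    return regressions


def clamp_to_later_minimum(lines):
    """Return a copy of lines in which no timestamp exceeds a later one."""
    fixed = list(lines)
    lowest_later = None
    # Walk backwards so each line is compared with everything after it.
    for index in range(len(fixed) - 1, -1, -1):
        match = TS_RE.match(fixed[index])
        if not match:
            continue
        ts = int(match.group(1))
        if lowest_later is not None and ts > lowest_later:
            fixed[index] = f"[{lowest_later}] " + fixed[index][match.end():]
        else:
            lowest_later = ts
    return fixed


# Shortened sample taken from the diff above.
sample = [
    "[1773615600] LOG ROTATION: DAILY",
    "[1773615601] TIMEPERIOD TRANSITION: 24x7;1;1",
    "[1773615601] logging initial states",
    "[1773615600] LOG VERSION: 2.0",
]

print(find_regressions(sample))  # [(4, 1773615600, 1773615601)]
print(find_regressions(clamp_to_later_minimum(sample)))  # []
# 1773615600 in UTC; with the +0100 offset from the diff header this is
# 2026-03-16 00:00:00 local time.
print(datetime.fromtimestamp(1773615600, tz=timezone.utc).isoformat())
# 2026-03-15T23:00:00+00:00
```

Working on a copy of the archive file, as done above with the `.orig` file, keeps the original available for comparison.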

I'm not certain whether the problem lies in Nagios, which generated the non-monotonic records, or in CheckMK, which is not able to handle this Nagios anomaly.
