CMK version: 2.4.0p24.cre / 2.4.0p22.cre
OS version: Debian 13
Hi,
Our operators create availability reports for the previous month at the start of each new one. This time there was a problem: the data was N/A for one whole day, which is over 3% of the period. The problematic day is 2026-03-16. After analyzing it for a while, I finally found the cause. The modification time of all the log files in ~/var/nagios/archive is 23:59; only the log file of the preceding day (nagios-03-16-2026-00.log) is slightly later, at 00:00:
OMD[nms@nms2]:~/var/nagios/archive$ ll nagios-03-*2026-00.log*
-rw-r--r-- 1 nms nms 4355906 Feb 28 23:59 nagios-03-01-2026-00.log
-rw-r--r-- 1 nms nms 4281898 Mar 1 23:59 nagios-03-02-2026-00.log
-rw-r--r-- 1 nms nms 4251195 Mar 2 23:59 nagios-03-03-2026-00.log
-rw-r--r-- 1 nms nms 4271704 Mar 3 23:59 nagios-03-04-2026-00.log
-rw-r--r-- 1 nms nms 4526051 Mar 4 23:59 nagios-03-05-2026-00.log
-rw-r--r-- 1 nms nms 4307700 Mar 5 23:59 nagios-03-06-2026-00.log
-rw-r--r-- 1 nms nms 4237518 Mar 6 23:59 nagios-03-07-2026-00.log
-rw-r--r-- 1 nms nms 4272722 Mar 7 23:59 nagios-03-08-2026-00.log
-rw-r--r-- 1 nms nms 4184375 Mar 8 23:59 nagios-03-09-2026-00.log
-rw-r--r-- 1 nms nms 4266795 Mar 9 23:59 nagios-03-10-2026-00.log
-rw-r--r-- 1 nms nms 8998907 Mar 10 23:59 nagios-03-11-2026-00.log
-rw-r--r-- 1 nms nms 4381790 Mar 11 23:59 nagios-03-12-2026-00.log
-rw-r--r-- 1 nms nms 6952642 Mar 12 23:59 nagios-03-13-2026-00.log
-rw-r--r-- 1 nms nms 4525390 Mar 13 23:59 nagios-03-14-2026-00.log
-rw-r--r-- 1 nms nms 4233513 Mar 14 23:59 nagios-03-15-2026-00.log
-rw-r--r-- 1 nms nms 4246126 Mar 16 00:00 nagios-03-16-2026-00.log
-rw-r--r-- 1 nms nms 4355023 Apr 2 12:59 nagios-03-17-2026-00.log
-rw-r--r-- 1 nms nms 4355023 Mar 16 23:59 nagios-03-17-2026-00.log.orig
-rw-r--r-- 1 nms nms 6771736 Mar 17 23:59 nagios-03-18-2026-00.log
-rw-r--r-- 1 nms nms 4322754 Mar 18 23:59 nagios-03-19-2026-00.log
-rw-r--r-- 1 nms nms 6706325 Mar 19 23:59 nagios-03-20-2026-00.log
-rw-r--r-- 1 nms nms 13974524 Mar 20 23:59 nagios-03-21-2026-00.log
-rw-r--r-- 1 nms nms 4453645 Mar 21 23:59 nagios-03-22-2026-00.log
-rw-r--r-- 1 nms nms 4491218 Mar 22 23:59 nagios-03-23-2026-00.log
-rw-r--r-- 1 nms nms 4490430 Mar 23 23:59 nagios-03-24-2026-00.log
-rw-r--r-- 1 nms nms 6671432 Mar 24 23:59 nagios-03-25-2026-00.log
-rw-r--r-- 1 nms nms 8977179 Mar 25 23:59 nagios-03-26-2026-00.log
-rw-r--r-- 1 nms nms 18229690 Mar 26 23:59 nagios-03-27-2026-00.log
-rw-r--r-- 1 nms nms 4654324 Mar 27 23:59 nagios-03-28-2026-00.log
-rw-r--r-- 1 nms nms 4374749 Mar 28 23:59 nagios-03-29-2026-00.log
-rw-r--r-- 1 nms nms 4408760 Mar 29 23:59 nagios-03-30-2026-00.log
-rw-r--r-- 1 nms nms 13603804 Mar 30 23:59 nagios-03-31-2026-00.log
There were lines with non-monotonic timestamps at the start of the file nagios-03-17-2026-00.log: the first entries are stamped one second later than the lines that follow them. I fixed the timestamps of those lines, and the availability report now works as expected. Here is the correction in detail:
OMD[nms@nms2]:~/var/nagios/archive$ diff -u nagios-03-17-2026-00.log.orig nagios-03-17-2026-00.log | sed -r 's/ALERT: [^;]+/ALERT: <censored>/;'
--- nagios-03-17-2026-00.log.orig 2026-03-16 23:59:39.762350836 +0100
+++ nagios-03-17-2026-00.log 2026-04-02 12:59:42.374705460 +0200
@@ -1,19 +1,19 @@
[1773615600] LOG ROTATION: DAILY
-[1773615601] HOST DOWNTIME ALERT: <censored>;STARTED; Host has entered a period of scheduled downtime
-[1773615601] HOST DOWNTIME ALERT: <censored>;STARTED; Host has entered a period of scheduled downtime
-[1773615601] HOST DOWNTIME ALERT: <censored>;STARTED; Host has entered a period of scheduled downtime
-[1773615601] HOST DOWNTIME ALERT: <censored>;STARTED; Host has entered a period of scheduled downtime
-[1773615601] SERVICE DOWNTIME ALERT: <censored>;Voltage PCH Vin;STARTED; Service has entered a period of scheduled downtime
-[1773615601] SERVICE DOWNTIME ALERT: <censored>;Service Snare;STARTED; Service has entered a period of scheduled downtime
-[1773615601] SERVICE DOWNTIME ALERT: <censored>;Sensor A Humidity;STARTED; Service has entered a period of scheduled downtime
-[1773615601] SERVICE DOWNTIME ALERT: <censored>;Interface Adaptive Security Appliance Internal-Data0/0 interface;STARTED; Service has entered a period of scheduled downtime
-[1773615601] TIMEPERIOD TRANSITION: 12x5;0;0
-[1773615601] TIMEPERIOD TRANSITION: 15x5;0;0
-[1773615601] TIMEPERIOD TRANSITION: 15x7;0;0
-[1773615601] TIMEPERIOD TRANSITION: 24X7;1;1
-[1773615601] TIMEPERIOD TRANSITION: 24x7;1;1
-[1773615601] TIMEPERIOD TRANSITION: 8x5;0;0
-[1773615601] logging initial states
+[1773615600] HOST DOWNTIME ALERT: <censored>;STARTED; Host has entered a period of scheduled downtime
+[1773615600] HOST DOWNTIME ALERT: <censored>;STARTED; Host has entered a period of scheduled downtime
+[1773615600] HOST DOWNTIME ALERT: <censored>;STARTED; Host has entered a period of scheduled downtime
+[1773615600] HOST DOWNTIME ALERT: <censored>;STARTED; Host has entered a period of scheduled downtime
+[1773615600] SERVICE DOWNTIME ALERT: <censored>;Voltage PCH Vin;STARTED; Service has entered a period of scheduled downtime
+[1773615600] SERVICE DOWNTIME ALERT: <censored>;Service Snare;STARTED; Service has entered a period of scheduled downtime
+[1773615600] SERVICE DOWNTIME ALERT: <censored>;Sensor A Humidity;STARTED; Service has entered a period of scheduled downtime
+[1773615600] SERVICE DOWNTIME ALERT: <censored>;Interface Adaptive Security Appliance Internal-Data0/0 interface;STARTED; Service has entered a period of scheduled downtime
+[1773615600] TIMEPERIOD TRANSITION: 12x5;0;0
+[1773615600] TIMEPERIOD TRANSITION: 15x5;0;0
+[1773615600] TIMEPERIOD TRANSITION: 15x7;0;0
+[1773615600] TIMEPERIOD TRANSITION: 24X7;1;1
+[1773615600] TIMEPERIOD TRANSITION: 24x7;1;1
+[1773615600] TIMEPERIOD TRANSITION: 8x5;0;0
+[1773615600] logging initial states
[1773615600] LOG VERSION: 2.0
[1773615600] CURRENT HOST STATE: 1102controller.i.cz;UP;HARD;1;OK - 10.0.164.2 rta 0.341ms lost 0%
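To make the one-second offset easier to see, the two epoch values from the diff can be decoded to wall-clock time. This is only a small Python sketch; the fixed UTC+1 offset is an assumption taken from the "+0100" in the diff header for 2026-03-16, and the helper name `decode` is made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the site was at UTC+1 on 2026-03-16, as the "+0100"
# in the diff header of the .orig file suggests.
LOCAL_TZ = timezone(timedelta(hours=1))


def decode(epoch: int) -> str:
    """Render a Nagios epoch timestamp as local wall-clock time."""
    return datetime.fromtimestamp(epoch, tz=LOCAL_TZ).isoformat(sep=" ")


print(decode(1773615600))  # 2026-03-16 00:00:00+01:00
print(decode(1773615601))  # 2026-03-16 00:00:01+01:00
```

So the rotation line and the initial state dump are stamped exactly at midnight, while the downtime and timeperiod lines between them are stamped one second after midnight.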
I'm not certain whether the problem lies in Nagios, which generated the non-monotonic records, or in CheckMK, which is not able to cope with this Nagios anomaly.
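In case someone wants to check their own archive for the same pattern, here is a minimal Python sketch that reports lines whose timestamp is lower than one already seen earlier in the file. It only assumes the `[epoch] message` line format visible in the diff above; the function name `find_backward_jumps` is hypothetical, and the script does not reproduce whatever CheckMK does internally when it reads these files.

```python
import re
from typing import Iterable, Iterator, Tuple

# Nagios log lines start with "[<epoch seconds>] ".
TIMESTAMP_RE = re.compile(r"^\[(\d+)\] ")


def find_backward_jumps(lines: Iterable[str]) -> Iterator[Tuple[int, int, int]]:
    """Yield (line_number, highest_seen, current) for every line whose
    timestamp is lower than the highest timestamp seen so far."""
    highest = None
    for number, line in enumerate(lines, start=1):
        match = TIMESTAMP_RE.match(line)
        if match is None:
            continue  # ignore lines without a leading [epoch] prefix
        current = int(match.group(1))
        if highest is not None and current < highest:
            yield number, highest, current
        else:
            highest = current


# Shortened sample taken from the start of the unfixed file.
sample = [
    "[1773615600] LOG ROTATION: DAILY",
    "[1773615601] logging initial states",
    "[1773615600] LOG VERSION: 2.0",
]
for number, highest, current in find_backward_jumps(sample):
    print(f"line {number}: {current} is lower than {highest}")
```

To scan a real file, pass an open file handle instead of the sample list, for example `find_backward_jumps(open(path, errors="replace"))`. Note that every line stays flagged until the timestamps catch up with the highest value again, so an unfixed file like mine would produce many hits at the start.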