Problem with run_cached/Linux agent after 2.1.0p35 update

CMK version:
Checkmk Raw Edition 2.1.0p35

OS version:
RHEL 8.8

Error message:

WARN ServernameA Check_MK [agent] Success, Missing monitoring data for plugins: chrony WARN, execution time 1.3 sec 2023-11-10 11:47:15 - 27.4 s
WARN ServernameB Check_MK [agent] Success, Missing monitoring data for plugins: chrony WARN, execution time 1.4 sec 2023-11-10 11:47:15 - 27.4 s
WARN ServernameC Check_MK [agent] Success, Missing monitoring data for plugins: chrony WARN, execution time 1.5 sec 2023-11-10 11:46:47 - 55.4 s
WARN ServernameX Check_MK [agent] Success, Missing monitoring data for plugins: chrony WARN, execution time 1.5 sec 2023-11-10 11:46:47 - 55.4 s
WARN ServernameY Check_MK [agent] Success, Missing monitoring data for plugins: chrony WARN, execution time 1.7 sec 2023-11-10 11:46:47 - 55.4 s
---
UNKN ServernameC OMD SiteName2 status Item not found in monitoring data 2023-11-27 09:31:21 - 14.9 s

Problem Description
Hey!

We’ve had strange issues with two built-in agent plugins (Chrony/NTP and OMD) whose data is intermittently missing from the agent output. The issue started after updating from 2.1.0p25 to 2.1.0p35.

Because the data disappears from the agent output only intermittently, the issue has been hard to troubleshoot, but the workaround below has worked so far:

First problem:
`run_cached "chrony" 30 "echo '<<<chrony>>>'; waitmax 5 chronyc -n tracking | cat || true" | sed 's/\(<<<chrony:cached(.*,\)30)>>>/\1120)>>>/'`
Workaround 1:
`echo '<<<chrony>>>'; waitmax 5 chronyc -n tracking | cat || true`
Second problem:
`run_cached "omd_status" 60 "echo '<<<omd_status>>>'; omd status --bare || true"`
Workaround 2:
`echo '<<<omd_status>>>'; omd status --bare || true`
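For anyone reading the first command for the first time: the trailing `sed` only touches the section header. As far as I can tell, it rewrites the second field of the `cached(...)` header from 30 to 120 and leaves the rest alone. A minimal illustration, assuming a header of the form `<<<chrony:cached(TIMESTAMP,30)>>>`; the timestamp below is made up and the variable names are only for the example:

```shell
# Hypothetical header as run_cached might emit it; the timestamp is invented.
header='<<<chrony:cached(1699609635,30)>>>'

# Same sed expression as in the agent line above: capture everything up to
# the comma in front of "30)>>>" and put "120)>>>" after the captured part.
rewritten=$(printf '%s\n' "$header" | sed 's/\(<<<chrony:cached(.*,\)30)>>>/\1120)>>>/')

printf '%s\n' "$rewritten"
```

With the sample header this prints `<<<chrony:cached(1699609635,120)>>>`. If the header does not end in `,30)>>>`, the expression does not match and the line passes through unchanged.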

We are still unsure about the root cause, but since removing the run_cached wrapper makes the problem go away, the culprit appears to be the run_cached part.

Can anybody tell whether this is a general issue worth reporting as a bug, or something related to our local environment?
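Since the sections only go missing some of the time, it may help to poll the agent on an affected host and log whether the section header is there. A rough sketch; `has_section` and `watch_section` are hypothetical helper names, not part of the agent, and the 10 second default interval is arbitrary:

```shell
# has_section NAME: read agent output on stdin and succeed if a header for
# NAME is present, either plain (<<<NAME>>>) or with options appended after
# a colon, such as <<<NAME:cached(...)>>>.
has_section() {
    grep -E "^<<<$1(:[^>]*)?>>>" >/dev/null
}

# watch_section NAME [INTERVAL_SECONDS]: run the local agent repeatedly and
# print one timestamped line per run saying whether the section was found.
# Runs until interrupted with Ctrl-C.
watch_section() {
    while true; do
        if check_mk_agent 2>/dev/null | has_section "$1"; then
            state=present
        else
            state=MISSING
        fi
        printf '%s %s %s\n' "$(date '+%F %T')" "$1" "$state"
        sleep "${2:-10}"
    done
}
```

Running `watch_section chrony` as root on the monitored host and leaving it for a while should show whether the header drops out of the raw agent output itself, or whether the data gets lost somewhere later.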

Hi.

Did you try to clean the cache with “cmk --flush”? It looks like there is wrong data in the cache.

Rg, Christian

Thanks for the suggestion @ChristianM. Will try it out in combination with the Checkmk 2.2 upgrade after the new year!

Hi,

After upgrading from 2.2.0p12 to 2.2.0p17, my dashboard is full of warnings.
I tried cleaning the cache with “cmk --flush”, without success.

[agent] Success, [piggyback] Success (but no data found for this host), Missing monitoring data for plugins: chrony WARN, execution time 3.0 sec
Vanished services: 1 (chrony: 1) WARN, Host labels: all up to date

Vanished services: 1 (omd_status: 1) WARN, Host labels: all up to date

The services keep appearing and disappearing.

This seems to be the same problem.

Thanks.

@ChristianM Unfortunately, cmk --flush gave us the same result as it did for @peter-held.

Servers and agents were upgraded to 2.2.0p17 this week and the issue is still there. We had to re-apply the workaround (removing run_cached/_run_cached_internal) again.
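One more thing that might be worth checking: as far as I understand it, `cmk --flush` clears data on the monitoring site, while run_cached keeps its files on the monitored host, so the two caches are separate. Below is a small sketch for listing the agent-side cache files with their age. The directory is an assumption (the default `/var/lib/check_mk_agent/cache` unless `MK_VARDIR` is set differently), and `cache_age` and `list_cache_ages` are hypothetical helper names:

```shell
# Assumed default location of the agent's cache directory on Linux; adjust
# if MK_VARDIR points somewhere else on your hosts.
CACHE_DIR="${MK_VARDIR:-/var/lib/check_mk_agent}/cache"

# cache_age FILE: print the age of FILE in whole seconds.
# Uses GNU stat (-c %Y), which is what RHEL ships.
cache_age() {
    now=$(date +%s)
    mtime=$(stat -c %Y "$1") || return 1
    echo $((now - mtime))
}

# list_cache_ages [DIR]: print "<age in seconds> <file>" for every regular
# file in DIR (defaults to CACHE_DIR).
list_cache_ages() {
    for f in "${1:-$CACHE_DIR}"/*; do
        [ -f "$f" ] || continue
        printf '%s %s\n' "$(cache_age "$f")" "$f"
    done
}
```

Calling `list_cache_ages` as root at a moment when the service has just vanished would show whether the chrony or omd_status cache file is missing, empty, or much older than its configured interval.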