Error message:
After a fresh installation of Checkmk Raw on RHEL 9 via dnf install ./check-mk-raw-2.1.0p27-el9-38.x86_64.rpm, no graphs are shown for the services. I get this message: “No historic metrics recorded but performance data is available. Maybe performance data processing is disabled.”
~/var/pnp4nagios/perfdata is empty.
I set SELinux on the host to permissive.
rrdcached.log is 0 bytes.
In npcd.log I found:
[05-13-2023 00:45:58] NPCD: ERROR: Executed command exits with return code ‘2’
[05-13-2023 00:45:58] NPCD: ERROR: Command line was ‘/omd/sites/uni_oldenburg/lib/pnp4nagios/process_perfdata.pl -n -c /omd/sites/mysite/etc/pnp4nagios/process_perfdata.cfg -b /omd/sites/mysite/var/pnp4nagios/spool//perfdata.1683931445’
Output of “cmk --debug -vvn hostname”: (If it is a problem with checks or plugins)
=> Command not found
Can't locate lib.pm in @INC (you may need to install the lib module) (@INC contains: /omd/sites/mysite/local/lib/perl5/lib/perl5 /omd/sites/mysite/lib/perl5/lib/perl5/x86_64-linux-thread-multi /omd/sites/mysite/lib/perl5/lib/perl5 /usr/local/lib64/perl5/5.32 /usr/local/share/perl5/5.32 /usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5) at /omd/sites/mysite/lib/pnp4nagios/process_perfdata.pl line 20.
BEGIN failed--compilation aborted at /omd/sites/mysite/lib/pnp4nagios/process_perfdata.pl line 20.
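The "Can't locate lib.pm in @INC" message above indicates that the system Perl cannot load the `lib` module. A minimal sketch for checking this directly is shown here; the helper name `perl_has_module` is hypothetical, and the `perl-lib` package name is taken from a later reply in this thread rather than verified independently.

```shell
# Return 0 if the system Perl can load the given module, non-zero otherwise
# (also non-zero if perl itself is not installed).
# $1: module name, e.g. "lib"
perl_has_module() {
    perl -M"$1" -e 1 >/dev/null 2>&1
}

if perl_has_module lib; then
    echo "lib.pm is loadable"
else
    # perl-lib is the package a later reply reports as the fix
    echo "lib.pm is not loadable; try (as root): dnf install perl-lib"
fi
```

Running `perl -c` on the process_perfdata.pl path from the npcd.log entry should reproduce the same compile error without processing any perfdata, since `-c` only compiles the script.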
I created an internal ticket, so this will be fixed in a future release.
As there is a valid workaround, I think we should be good.
Thanks for reporting this @florian.hanner and welcome to the community!
I was running Raw p27 on Oracle Linux 9 when I noticed the issue (high CPU usage and no graphs being created). Manually installing perl-lib fixed the perfdata errors, and graphs are now being created, with an occasional ERROR 7 appearing in the logs. The only remaining issue is that the var/rrdcached/ directory fills up with rrd.journal.* files.
None of the following changes got the “stale” journal files processed: flushing the RRDs via the socket, updating to the latest OMD, fine-tuning the related config files (timeouts, load thresholds), or increasing the vCPUs from 4 to 6 (120 hosts, 2400 checks).
This behavior only occurs in 2.x. The same configuration in 1.6.0p24.cre works as expected (graphs are created and there are only a couple of rrd journal files), with considerably lower CPU, memory, and I/O usage. In 2.x, the service check performance jumps from 2 to 33 seconds for no clear reason.
Strangely enough, restarting omd.service via systemctl restart omd.service fixed the rrdcached issue. Restarting only the instance neither processed the previous journal files nor generated graphs.
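To see whether the journal backlog is draining, for example before and after systemctl restart omd.service, counting the rrd.journal.* files may help. This is a rough sketch: the helper name `count_rrd_journals` is hypothetical, and it assumes it is run as the site user so that `$HOME` is the site directory (as in the `~/var/pnp4nagios/perfdata` path in the first post).

```shell
# Count rrdcached journal files directly inside a directory.
# $1: directory to inspect
count_rrd_journals() {
    find "$1" -maxdepth 1 -type f -name 'rrd.journal.*' | wc -l | tr -d ' '
}

# Assumption: run as the site user, so $HOME is the site directory.
journal_dir="${HOME:-.}/var/rrdcached"
if [ -d "$journal_dir" ]; then
    echo "journal files: $(count_rrd_journals "$journal_dir")"
fi
```

Running it a few minutes apart shows whether the number of journal files is growing or shrinking.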