Setting a "Host Check Command" breaks page "Service graphs of host"!? ("No historic metrics recorded but performance data is available.")

gregor.hoffleit · February 7, 2024, 5:36pm

CMK version:
2.2.0p21 cme

OS version:
Debian 11

Description of the problem:
When setting a “Host Check Command” rule (for a host xyz), the service graphs on the page “Service graphs of host xyz” disappear.
Instead a message “No historic metrics recorded but performance data is available. Maybe performance data processing is disabled.” is shown.

For me, this always happens with the host check command “Use the status of the Checkmk Agent”.

For “Use the status of the service xyz”, it seems to depend on the service chosen as host check.
With “Always assume host to be up” there’s no problem.

The service graphs are fine on all other pages. The problem is not with the graphs but with this page.

Steps to reproduce:

Set a rule “Host Check Command” = “Use the status of the Checkmk Agent” for a host.
Wait for the change to be executed.
Visit the page “Host” / “Service graphs of host” → The graphs are gone.

joerg.mohar · February 8, 2024, 7:30am

Hello. Same behavior on Ubuntu 20.04 and 2.0.0p33 (CEE). I assume this is a bug (??), not a feature, as the service graphs are working fine.

Same when choosing Check_MK as Service:

gregor.hoffleit · February 8, 2024, 9:36am

Thanks @joerg.mohar for reproducing this behavior.

I hope that the Checkmk team picks up this bug report.

MarsellusWallace · February 8, 2024, 4:04pm

Hi @joerg.mohar and @gregor.hoffleit,

Jörg, 2.0.0 has been EOL since September 2023, so we will not do anything on it anymore.
Regarding the situation you’re facing: what was the former host check command for those hosts?

Best regards,
Marsellus W.

MarsellusWallace · February 8, 2024, 4:18pm

Checked it myself, works as expected:

The performance data is available, as long as the chosen service in host check command provides perfdata, but it belongs to the service, not the host, so the message is correct…

gregor.hoffleit · February 9, 2024, 12:43pm

Thanks, @MarsellusWallace, for the fast reaction!

“The performance data is available, as long as the chosen service in host check command provides perfdata, but it belongs to the service, not the host, so the message is correct…”

I’m not sure we’re talking about the same thing here. Perhaps ignore the case with “Use the status of the service…” for the moment, and let’s have a look at these three cases:

no host check command
“Use the status of the Checkmk Agent”
“Always assume host to be up”

“Regarding the situation you’re facing: what was the former host check command for those hosts?”

Quick rationale beforehand: Those hosts are not reachable from the Checkmk server.

Therefore we implemented our own poor man’s push mode: check_mk_agent runs as a cronjob and its output is pushed to the Checkmk server via sftp. Checkmk picks up the file with a simple datasource program. The datasource program sets an error code -1 or -2 if the file is too old or missing.
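
A minimal sketch of such a datasource program, for illustration only. The function names, the 180-second age limit and the assignment of -1 to “too old” and -2 to “missing” are assumptions; the thread does not show the actual script.

```python
import os
import sys
import time

# All values below are assumptions for illustration, not taken from the thread.
MAX_AGE_SECONDS = 180  # hypothetical limit for "too old"
ERR_TOO_OLD = -1       # assumed meaning of error code -1
ERR_MISSING = -2       # assumed meaning of error code -2


def read_pushed_output(path, max_age=MAX_AGE_SECONDS, now=None):
    """Return (exit_code, agent_output) for a pushed agent output file."""
    now = time.time() if now is None else now
    try:
        mtime = os.stat(path).st_mtime
    except FileNotFoundError:
        return ERR_MISSING, ""
    if now - mtime > max_age:
        return ERR_TOO_OLD, ""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return 0, handle.read()


def main(argv):
    """Print the pushed agent output to stdout and return the exit code."""
    if len(argv) != 2:
        print("usage: pushed_agent_output <file>", file=sys.stderr)
        return 2
    code, output = read_pushed_output(argv[1])
    sys.stdout.write(output)
    # Note: on POSIX a negative value given to sys.exit() is reported
    # modulo 256, so -1 arrives as exit status 255 and -2 as 254.
    return code
```

As a script it would end with `sys.exit(main(sys.argv))`, and the datasource program rule would pass the path of the pushed file as the only argument.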

(1) no ‘host check command’:

“Service graphs of host …” works, shows the service graphs of the host
But: Host is always DOWN (since the smart pings don’t come through)

(2) host check command = “Use the status of the Checkmk Agent”:

Checkmk’s host status reasonably reflects the status of the host
But: “Service graphs of host …” shows an error message instead of the graphs

(3) host check command = “Always assume host to be up”:

“Service graphs of host …” works, shows the service graphs of the host
But: Host is always UP (regardless of the real state)

I’m pretty sure that (2) worked when we were using Raw Edition with Nagios core.

MarsellusWallace · February 14, 2024, 3:44pm

Apologies for the late reply, simply too much work…

That’s something I always tell my team members: “Do read tickets in detail and do not assume anything.”

…now I fell into exactly this trap: I only thought of the host’s perfgraph line (which shows up in the “status of host” page and shows the same message)…

I’ll test again using the “service graphs of host” page tomorrow and report back!

Dirk · February 15, 2024, 10:51am

I know this isn’t the solution to your problem, but maybe it is an alternative. We have a similar situation (poor man’s push) and use a custom check plugin as the host check command:

check_file_age -w 120 -c 180 -i -f /path/to/agent-output-of/$HOSTNAME$

The check check_file_age comes with Checkmk and simply checks the age and existence of a given file. In our case it goes yellow if the file is older than 120 seconds and red after three minutes. It is located in ~/lib/nagios/plugins/check_file_age.

With this check the host graph works.
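
The thresholds in the command above can be sketched roughly as follows. This is only an illustration of the warn/crit logic, not the plugin's code: the function name is hypothetical, the strict "older than" comparison is an assumption, and the handling of a missing file is not modeled.

```python
# Standard Nagios plugin return codes.
OK, WARNING, CRITICAL = 0, 1, 2


def classify_file_age(age_seconds, warn=120, crit=180):
    """Map a file age in seconds to a plugin state (-w 120 -c 180)."""
    if age_seconds > crit:
        return CRITICAL  # "red" after three minutes
    if age_seconds > warn:
        return WARNING   # "yellow" when older than 120 seconds
    return OK
```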

MarsellusWallace · February 15, 2024, 5:09pm

Hi guys,

I reproduced the issue with the “service graphs of host” page and opened an internal ticket with high prio!

Nevertheless my response regarding this message in the host perfgraph remains valid and the same: perfdata is available but belongs to a service.

That does not diminish the issue with the failing page, though.

MarsellusWallace · February 19, 2024, 10:09pm

Hi guys,

I just wanted to let you know that our developers are working on this issue!

BR,
Marsellus W.

gregor.hoffleit · February 28, 2024, 2:05pm

Thanks @MarsellusWallace!

I just read that this problem was fixed with Werk 16049 (in 2.2.0p23).

Will confirm here whether the fix worked for me ;-).

gregor.hoffleit · February 29, 2024, 3:40pm

Werk 16049 in 2.2.0p23 fixed this problem for me!

Again, thanks @MarsellusWallace for your patience in reading my bug reports!