It’s the check “temperature zone 0” from the checkmk server itself.
It’s an HP ProLiant DL360 Gen10
Intel Xeon Silver 4110 CPU @ 2.10GHz
32 GB RAM
Strangely, right after updating, the temperature rose to crit. I restarted the server, but within 5-10 minutes the temperature was right back at crit state. Not sure if I should go back to version 2.0.0.12p.
I suspect the problem is not the sensor but an increase in CPU load, which would raise the temperature along with it. Have you checked the CPU load graph? Did the load increase after the update?
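If you want a quick cross-check outside the graphs, here is a minimal sketch that reads the 1-minute load and the zone temperature in one go. It assumes the check gets its value from the Linux sysfs file `/sys/class/thermal/thermal_zone0/temp` (reported in millidegrees Celsius); I have not verified that this is the case on your server. The function names are made up for illustration.

```python
import os
from pathlib import Path

# Assumption: the kernel exposes the zone here, in millidegrees Celsius.
ZONE_PATH = Path("/sys/class/thermal/thermal_zone0/temp")


def parse_millidegrees(raw: str) -> float:
    """Convert a raw sysfs reading (e.g. 45000) to degrees Celsius."""
    return int(raw.strip()) / 1000.0


def load_per_core(load: float, cores: int) -> float:
    """Normalize a load average by the number of CPU cores."""
    return load / max(cores, 1)


def snapshot() -> dict:
    """Take one reading of the 1-minute load and, if present, the zone temperature."""
    load1, _, _ = os.getloadavg()  # Unix only
    cores = os.cpu_count() or 1
    temp = parse_millidegrees(ZONE_PATH.read_text()) if ZONE_PATH.exists() else None
    return {"load1": load1, "load_per_core": load_per_core(load1, cores), "temp_c": temp}


if __name__ == "__main__":
    print(snapshot())
```

Running it a few times while the check is at crit should show whether a high load per core coincides with the high temperature.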
Please ignore this check on hardware like the HP ProLiant; it is useless there.
For better results, use the values provided by the iLO interface.
Only these values are relevant for detecting overheating or a hardware failure.
I’ve seen similar behaviour some time ago. As far as I remember, something seemed to be wrong with the RRD update, which was restarting processes and generating log entries multiple times per second.
Try to drill down to the culprit process and look for logs that grow more quickly than normal.
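One way to spot fast-growing logs is to compare file sizes a few seconds apart. This is only a rough sketch: the directory you pass in is up to you (on a Checkmk site the logs usually live under `/omd/sites/<site>/var/log`), and the 10-second interval, the `*.log` pattern and the function names are arbitrary choices of mine.

```python
import sys
import time
from pathlib import Path


def file_sizes(root: Path, pattern: str = "*.log") -> dict:
    """Map each matching file under root to its current size in bytes."""
    sizes = {}
    for path in root.rglob(pattern):
        try:
            if path.is_file():
                sizes[str(path)] = path.stat().st_size
        except OSError:
            continue  # file vanished or is unreadable; skip it
    return sizes


def growth(before: dict, after: dict) -> list:
    """Return (path, bytes_grown) pairs for files that grew, largest first."""
    deltas = [(p, after[p] - before.get(p, 0)) for p in after]
    return sorted((d for d in deltas if d[1] > 0), key=lambda d: d[1], reverse=True)


if __name__ == "__main__":
    root = Path(sys.argv[1])  # log directory to watch
    before = file_sizes(root)
    time.sleep(10)  # sampling interval, chosen arbitrarily
    after = file_sizes(root)
    for path, delta in growth(before, after)[:10]:
        print(f"{delta:>12} bytes  {path}")
```

Whatever ends up at the top of that list is a good place to start reading; the process writing to it is probably the one eating the CPU.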
Sorry for being so vague, but since it was just a test site back then, I just trashed it and didn’t investigate any further.