Critical Hardware Sensors VMware

CMK version: 1.6.0p9
OS version: VMware ESXi 7.0.2

Error message:

Output of “cmk --debug -vvn hostname”:

Hardware Sensors     CRIT -
Disk 10 on HPSA1 : Port  Box 0 Bay 19 : 0GB : Unconfigured Disk : Disk Error: Red (The physical element is failing)(!!),
Disk 11 on HPSA1 : Port  Box 0 Bay 26 : 0GB : Unconfigured Disk : Disk Error: Red (The physical element is failing)(!!),
Disk 12 on HPSA1 : Port  Box 0 Bay 29 : 0GB : Unconfigured Disk : Disk Error: Red (The physical element is failing)(!!),
Disk 13 on HPSA1 : Port  Box 0 Bay 35 : 0GB : Unconfigured Disk : Disk Error: Red (The physical element is failing)(!!),
Disk 14 on HPSA1 : Port  Box 0 Bay 38 : 0GB : Unconfigured Disk : Disk Error: Red (The physical element is failing)(!!),
Disk 15 on HPSA1 : Port  Box 0 Bay 41 : 0GB : Unconfigured Disk : Disk Error: Red (The physical element is failing)(!!),
Disk 16 on HPSA1 : Port  Box 0 Bay 42 : 0GB : Unconfigured Disk : Disk Error: Red (The physical element is failing)(!!),
Disk 17 on HPSA1 : Port  Box 0 Bay 43 : 0GB : Unconfigured Disk : Disk Error: Red (The physical element is failing)(!!),
Disk 18 on HPSA1 : Port  Box 0 Bay 46 : 0GB : Unconfigured Disk : Disk Error: Red (The physical element is failing)(!!),
Disk 19 on HPSA1 : Port  Box 0 Bay 49 : 0GB : Unconfigured Disk : Disk Error: Red (The physical element is failing)(!!),
Disk 2 on HPSA1 : Port 1I Box 1 Bay 101 : 0GB : Unconfigured Disk : Disk Error: Red (The physical element is failing)(!!),
Disk 20 on HPSA1 : Port  Box 0 Bay 50 : 0GB : Unconfigured Disk : Disk Error: Red (The physical element is failing)(!!),
Disk 21 on HPSA1 : Port  Box 0 Bay 51 : 0GB : Unconfigured Disk : Disk Error: Red (The physical element is failing)(!!),
Disk 22 on HPSA1 : Port  Box 0 Bay 53 : 0GB : Unconfigured Disk : Disk Error: Red (The physical element is failing)(!!),
Disk 23 on HPSA1 : Port  Box 0 Bay 58 : 0GB : Unconfigured Disk : Disk Error: Red (The physical element is failing)(!!),
Disk 24 on HPSA1 : Port  Box 0 Bay 59 : 0GB : Unconfigured Disk : Disk Error: Red (The physical element is failing)(!!),
Disk 25 on HPSA1 : Port  Box 0 Bay 61 : 0GB : Unconfigured Disk : Disk Error: Red (The physical element is failing)(!!),
Disk 26 on HPSA1 : Port  Box 0 Bay 64 : 0GB : Unconfigured Disk : Disk Error: Red (The physical element is failing)(!!),
Disk 27 on HPSA1 : Port  Box 0 Bay 65 : 0GB : Unconfigured Disk : Disk Error: Red (The physical element is failing)(!!),
Disk 28 on HPSA1 : Port  Box 0 Bay 66 : 0GB : Unconfigured Disk : Disk Error: Red (The physical element is failing)(!!),
Disk 29 on HPSA1 : Port  Box 0 Bay 67 : 0GB : Unconfigured Disk : Disk Error: Red (The physical element is failing)(!!),
Disk 30 on HPSA1 : Port  Box 0 Bay 74 : 0GB : Unconfigured Disk : Disk Error: Red (The physical element is failing)(!!),
Disk 31 on HPSA1 : Port  Box 0 Bay 81 : 0GB : Unconfigured Disk : Disk Error: Red (The physical element is failing)(!!),
Disk 32 on HPSA1 : Port  Box 0 Bay 83 : 0GB : Unconfigured Disk : Disk Error: Red (The physical element is failing)(!!),
Disk 33 on HPSA1 : Port  Box 0 Bay 86 : 0GB : Unconfigured Disk : Disk Error: Red (The physical element is failing)(!!),
Disk 34 on HPSA1 : Port  Box 0 Bay 89 : 0GB : Unconfigured Disk : Disk Error: Red (The physical element is failing)(!!),
Disk 35 on HPSA1 : Port  Box 0 Bay 90 : 0GB : Unconfigured Disk : Disk Error: Red (The physical element is failing)(!!),
Disk 36 on HPSA1 : Port  Box 0 Bay 97 : 0GB : Unconfigured Disk : Disk Error: Red (The physical element is failing)(!!),
Disk 37 on HPSA1 : Port  Box 0 Bay 98 : 0GB : Unconfigured Disk : Disk Error: Red (The physical element is failing)(!!),
Disk 38 on HPSA1 : Port  Box 0 Bay 103 : 0GB : Unconfigured Disk : Disk Error: Red (The physical element is failing)(!!),
Disk 39 on HPSA1 : Port  Box 0 Bay 105 : 0GB : Unconfigured Disk : Disk Error: Red (The physical element is failing)(!!),
Disk 4 on HPSA1 : Port  Box 0 Bay 9 : 0GB : Unconfigured Disk : Disk Error: Red (The physical element is failing)(!!),
Disk 40 on HPSA1 : Port  Box 0 Bay 106 : 0GB : Unconfigured Disk : Disk Error: Red (The physical element is failing)(!!),
Disk 41 on HPSA1 : Port  Box 0 Bay 109 : 0GB : Unconfigured Disk : Disk Error: Red (The physical element is failing)(!!),
Disk 5 on HPSA1 : Port  Box 0 Bay 10 : 0GB : Unconfigured Disk : Disk Error: Red (The physical element is failing)(!!),
Disk 6 on HPSA1 : Port  Box 0 Bay 11 : 0GB : Unconfigured Disk : Disk Error: Red (The physical element is failing)(!!),
Disk 7 on HPSA1 : Port  Box 0 Bay 15 : 0GB : Unconfigured Disk : Disk Error: Red (The physical element is failing)(!!),
Disk 8 on HPSA1 : Port  Box 0 Bay 17 : 0GB : Unconfigured Disk : Disk Error: Red (The physical element is failing)(!!),
Disk 9 on HPSA1 : Port  Box 0 Bay 18 : 0GB : Unconfigured Disk : Disk Error: Red (The physical element is failing)(!!),

Hello.
I would like to know how to clear an alarm for a problem that no longer exists on the monitored host. It affects the Hardware Sensors service of a VMware hypervisor.
The service reports a critical state that the hypervisor itself does not show.

The service description in Checkmk advises me to go to the URL it specifies:

This check checks the state of all of a VMWare ESX host system’s hardware sensors - including temperatures, fans, power supplies, memory DIMMs, hard disks, and others. In order to avoid network traffic the agent sends only information about sensors that are not in green state.
Note: Due to a caching problem on the ESX host system side, this check occasionally reports incorrect sensor data. This may mean that the sensor appears to be stuck in an unhealthy state. You can find more information here: VMware Knowledge Base

I have done what the URL says, and the error still appears.
Do you know how I can remove or fix it?

Thank you.

Hello,

You can use the following command to check whether any sensors are not in the “green” state:

Get-VMHost myesxhost | % {(get-view $_.id).runtime.healthSystemRuntime.systemHealthInfo.NumericSensorInfo | Where-Object {$_.HealthState.Label -notmatch 'Green'}}

First, try the following command to refresh the health status:

(Get-View (Get-VMHost -Name esxhostname | Get-View).ConfigManager.HealthStatusSystem).RefreshHealthStatusSystem()

If that doesn't work, try this one to reset the IPMI sensors:

(Get-View (Get-VMHost -Name esxhostname | Get-View).ConfigManager.HealthStatusSystem).ResetSystemHealthInfo()
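The commands above can be combined into one short script. This is only a sketch, assuming VMware PowerCLI is installed and a session has already been opened with Connect-VIServer; 'esxhostname' is the same placeholder as above. It also lists storage elements from HardwareStatusInfo.StorageStatusInfo, on the assumption that disk entries like the ones in the question are reported there rather than under the numeric sensors. That assumption is based on the vSphere API object model and has not been verified against this host.

```powershell
# Sketch only. Assumes VMware PowerCLI is installed and Connect-VIServer
# has already been run. 'esxhostname' is a placeholder, as above.
$hostView     = Get-VMHost -Name 'esxhostname' | Get-View
$healthSystem = Get-View -Id $hostView.ConfigManager.HealthStatusSystem

# Ask the host to re-read its sensors, then reload the cached view data.
$healthSystem.RefreshHealthStatusSystem()
$hostView.UpdateViewData()

$health = $hostView.Runtime.HealthSystemRuntime

# Numeric sensors (fans, temperatures, power, ...) that are not green.
$health.SystemHealthInfo.NumericSensorInfo |
    Where-Object { $_ -and $_.HealthState.Label -notmatch 'Green' } |
    Select-Object Name, @{ Name = 'State'; Expression = { $_.HealthState.Label } }

# Storage elements that are not green.
# Assumption: the disk entries from the check output are reported here.
$health.HardwareStatusInfo.StorageStatusInfo |
    Where-Object { $_ -and $_.Status.Label -notmatch 'Green' } |
    Select-Object Name, @{ Name = 'State'; Expression = { $_.Status.Label } }
```

If entries are still listed as non-green after the refresh, the host itself is still reporting them, so the Checkmk service will keep showing them as well.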

In addition, you may try to clear the event log on the ESX hosts.

Hello.
Thanks for the answer. I have executed the commands and cleared the logs, but that does not solve it. The alerts keep appearing.

Hello Community

I have the same issue on 3 HPE G10+ servers.
I have seen it before, and back then we uninstalled the HPE tools on the servers (RHEL8).
But on these servers the situation is completely different: we are running an ESX cluster on them, and I can’t disable or uninstall the HPE tools.

This is a known HPE firmware bug. There is no real solution other than hoping that HPE will fix it.

Hi Andreas,
thanks for the hint about the firmware bug.
What would be the best way to get the sensor back to green?

This is the HPE Customer Advisory for that problem:

https://support.hpe.com/hpesc/public/docDisplay?docId=emr_na-a00117054en_us

Solutions depend on the server generation (9/10), but the solution described for Gen10 (uninstalling the smux providers) did not fix the bug for me on several servers and instead led to no output at all.
I disabled that check and only use the iLO hardware monitoring until HPE releases new P controller firmware…

Thanks for your help 👍

FYI

I opened a support ticket with HPE for this problem. They told us that they are not going to fix it and that we should use the iLO for hardware-related monitoring.

Thanks @aeckstein. Still what a s*it answer from HPE…
