Storage/Disk Monitoring of HPE Gen 10 Server iLO 5 gone missing

CMK version: 2.1.0p34
OS version: RHEL 7

Error message: Disk monitoring no longer available - no error message, but services are showing as vanished

On a DL380 Gen 10 with iLO 5, now running version 3.0.0 of the iLO firmware, we seem to have lost monitoring of the disk status. This is not monitored via the host OS; the iLO is queried directly via SNMP.

After upgrading the iLO firmware to version 3.0.0, the SNMP monitoring of the disk status seems to have vanished. Is this an incompatibility with CheckMK that anyone knows of?

HPE Integrated Lights-Out 5 Disk or Storage monitoring issue not working missing

After some further investigation we have ruled out CheckMK as the issue, thank heavens! Hopefully this can save someone some time!

The issue appears to be caused by changes made in the 3.0.0 iLO 5 firmware for ProLiant Gen 10 and Gen 10+ servers. The update seems to have removed the OIDs that relate to the disks, presumably as part of “adding” support for more Gen10+ storage cards, which are mentioned in the iLO firmware release notes.

From the release notes:

  • SNMP GET, GET-NEXT and WALK support added for the following storage controllers on iLO5
    • HPE MR216i-a Gen10 Plus
    • HPE MR216i-p Gen10 Plus
    • HPE MR416i-a Gen10 Plus
    • HPE MR416i-p Gen10 Plus
    • HPE SR932i-p Gen10 Plus
    • HPE SR416i-a Gen10 Plus
    • HPE NS204i-p Gen10 Plus Boot Controller

Version 2.9.9 seems to work correctly: after downgrading, the monitoring came back again. This appears to be a bug in the iLO firmware.

The OIDs being requested are in the branch:

.1.3.6.1.4.1.232.3.2.5.

The failing OIDs could be seen by running a verbose discovery:

cmk -Ivvvv <hostname>

The output showed that the “232” ranges were failing to query correctly.

Comparing a version 3.0.0 box with a version 2.9.9 box and searching the snmpwalk output for “Bay” shows that the drive location entries are missing on 3.0.0.

v 2.9.9 - Data present

OMD[sitename]:~$ snmpwalk -v3 -l authPriv -u <username> -a MD5 -A "<xx>" -x AES -X "<yy>" <hostname1> -On -Ci 1.3.6.1.4.1.232.3.2.5 | grep "Bay"
.1.3.6.1.4.1.232.3.2.5.1.1.64.0.0 = STRING: "Port=1I:Box=3:Bay=4"
.1.3.6.1.4.1.232.3.2.5.1.1.64.0.1 = STRING: "Port=1I:Box=3:Bay=3"
.1.3.6.1.4.1.232.3.2.5.1.1.64.0.2 = STRING: "Port=1I:Box=3:Bay=2"
.1.3.6.1.4.1.232.3.2.5.1.1.64.0.3 = STRING: "Port=1I:Box=3:Bay=1"
.1.3.6.1.4.1.232.3.2.5.1.1.64.0.4 = STRING: "Port=2I:Box=3:Bay=5"
.1.3.6.1.4.1.232.3.2.5.1.1.64.0.5 = STRING: "Port=2I:Box=3:Bay=6"
.1.3.6.1.4.1.232.3.2.5.1.1.64.0.6 = STRING: "Port=2I:Box=3:Bay=7"
.1.3.6.1.4.1.232.3.2.5.1.1.64.0.7 = STRING: "Port=2I:Box=3:Bay=8"
.1.3.6.1.4.1.232.3.2.5.1.1.64.0.10 = STRING: "Port=3I:Box=2:Bay=2"
.1.3.6.1.4.1.232.3.2.5.1.1.64.0.11 = STRING: "Port=3I:Box=2:Bay=1"

v 3.0.0 - Data Missing

OMD[sitename]:~$ snmpwalk -v3 -l authPriv -u <username> -a MD5 -A "<xx>" -x AES -X "<yy>" <hostname2> -On -Ci 1.3.6.1.4.1.232.3.2.5
.1.3.6.1.4.1.232.3.2.5 = No Such Object available on this agent at this OID
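
If you want to check several hosts for this, the comparison above can be scripted. The following is only a rough sketch, assuming the snmpwalk output has been saved as text in the same `-On` format shown above. The function names `parse_bays` and `branch_missing` are made up for this example and are not part of CheckMK or Net-SNMP.

```python
import re

# One line of `snmpwalk -On` output carrying a string value, e.g.
# .1.3.6.1.4.1.232.3.2.5.1.1.64.0.0 = STRING: "Port=1I:Box=3:Bay=4"
LINE_RE = re.compile(r'^(\.[0-9.]+) = STRING: "(.*)"$')

# Drive location string as it appears in the 2.9.9 output above.
LOCATION_RE = re.compile(r'^Port=([^:]+):Box=(\d+):Bay=(\d+)$')

# Text the agent returned on the 3.0.0 box for the whole branch.
NO_OBJECT = "No Such Object available on this agent at this OID"


def parse_bays(walk_output):
    """Return (oid, port, box, bay) for every drive location line found."""
    bays = []
    for line in walk_output.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        location = LOCATION_RE.match(match.group(2))
        if location:
            bays.append((
                match.group(1),
                location.group(1),
                int(location.group(2)),
                int(location.group(3)),
            ))
    return bays


def branch_missing(walk_output):
    """True if the agent reported that the walked branch does not exist."""
    return NO_OBJECT in walk_output
```

A host whose saved output gives an empty list from `parse_bays` and `True` from `branch_missing` is showing the same symptom as the 3.0.0 box above.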