CMK version: 2.4.0p21.cme
OS version: RHEL 9
Error message:
Admittedly, I don’t remember whether these checks were present before migrating from 2.3.0p38.cme to 2.4.0p21.cme, so I can’t tell if they were OK before the upgrade or simply not detected. After the upgrade I ran a bulk discovery to pick up multiple plugin updates.
The issue is that the blade_powerfan check returns Critical for the Power Module Cooling Devices in an older IBM chassis.
Output of “cmk --debug -vvn hostname”: (If it is a problem with checks or plugins)
Power Module Cooling Device 1 Speed: 59.00%, RPM: 5973.0, Controller state: not OK(!!)
Power Module Cooling Device 2 Speed: 59.00%, RPM: 5888.0, Controller state: not OK(!!)
Power Module Cooling Device 3 Speed: 59.00%, RPM: 5888.0, Controller state: not OK(!!)
Power Module Cooling Device 4 Speed: 60.00%, RPM: 5994.0, Controller state: not OK(!!)
Digging around, I located firmware that ships with a MIB file, and found what I believe is the MIB object behind the ‘Controller state’ value:
fanPackControllerState OBJECT-TYPE
    SYNTAX  INTEGER {
        operational(0),
        flashing(1),
        notPresent(2),
        communicationError(3),
        unknown(255)
    }
    ACCESS  read-only
    STATUS  mandatory
    DESCRIPTION
        "The health state for the controller for the fan pack.
        0 = operational, 1 = flashing in progress, 2 = not present, 3 = communication error,
        255 = unknown"
    ::= { fanPackEntry 7 }
I can query it via SNMPv3 and get what looks like an ‘operational’ response for all four devices:
.1.3.6.1.4.1.2.3.51.2.2.6.1.1.7.1 0
.1.3.6.1.4.1.2.3.51.2.2.6.1.1.7.2 0
.1.3.6.1.4.1.2.3.51.2.2.6.1.1.7.3 0
.1.3.6.1.4.1.2.3.51.2.2.6.1.1.7.4 0
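To make my expectation concrete, here is a rough sketch of how I would expect those raw values to translate, based only on the MIB enumeration above. This is not the shipped blade_powerfan code (I have not read it); the names `CONTROLLER_STATES`, `SEVERITY` and `controller_state` are hypothetical, and the severities chosen for the non-operational values are my own assumption.

```python
# Hypothetical sketch, NOT the actual blade_powerfan check code.
# Labels are taken from the fanPackControllerState enumeration in the MIB.
CONTROLLER_STATES = {
    0: "operational",
    1: "flashing",
    2: "notPresent",
    3: "communicationError",
    255: "unknown",
}

# Monitoring states: 0 = OK, 1 = WARN, 2 = CRIT, 3 = UNKNOWN.
# Which severity each label deserves is an assumption on my part.
SEVERITY = {
    "operational": 0,
    "flashing": 1,
    "notPresent": 2,
    "communicationError": 2,
    "unknown": 3,
}


def controller_state(raw: str) -> tuple[int, str]:
    """Translate a raw SNMP value into (monitoring state, label)."""
    try:
        label = CONTROLLER_STATES[int(raw)]
    except (KeyError, ValueError):
        # Value is not an integer or not part of the MIB enumeration.
        return 3, f"unexpected value {raw!r}"
    return SEVERITY[label], label


# Values exactly as returned by my SNMPv3 query.
walk = {
    ".1.3.6.1.4.1.2.3.51.2.2.6.1.1.7.1": "0",
    ".1.3.6.1.4.1.2.3.51.2.2.6.1.1.7.2": "0",
    ".1.3.6.1.4.1.2.3.51.2.2.6.1.1.7.3": "0",
    ".1.3.6.1.4.1.2.3.51.2.2.6.1.1.7.4": "0",
}

for oid, raw in walk.items():
    state, label = controller_state(raw)
    print(oid, state, label)
```

With a mapping like this, all four devices would come out as OK, which is why the ‘not OK’ result surprises me.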
Since this is an inventoried check I can’t override the status, so in the short term I have disabled blade_powerfan for several chassis.
Is this check maintained by Checkmk, or is it something inherited from Nagios? Any thoughts on how to correct it, or should I leave it disabled?
Sincerely,
Scotsie
