Check_MK Discovery Problem on HPE Server

CMK version: 2.1.0p9 cee
OS version: Red Hat 3.10.0.

Error message: no unmonitored services found, 103 vanished services + SystemError(‘ returned a result with an error set’)

Hi there, we have a problem with an HPE Apollo server and SNMP on the iLO.
The Check_MK Discovery check shows the following:

no unmonitored services found, 103 vanished services (hp_proliant_cpu:1, hp_proliant_da_cntlr:2, hp_proliant_da_phydrv:28, hp_proliant_fans:5, hp_proliant_mem:2, hp_proliant_power:1, hp_proliant_psu:3, hp_proliant_raid:2, hp_proliant_temp:38, hp_sts_drvbox:2, interfaces:17, snmp_info:1, uptime:1), no new host labels, [snmp] SystemError(‘ returned a result with an error set’) CRIT

So all 103 services are reported as vanished.
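To see which plugins account for the vanished services, the summary line can be split up with a few lines of Python. This is only a rough sketch: `vanished_by_plugin` is a hypothetical helper name, and the summary string is copied from the discovery output above (without the trailing error text).

```python
import re

# Summary text copied from the Check_MK Discovery output in this post.
SUMMARY = (
    "no unmonitored services found, 103 vanished services "
    "(hp_proliant_cpu:1, hp_proliant_da_cntlr:2, hp_proliant_da_phydrv:28, "
    "hp_proliant_fans:5, hp_proliant_mem:2, hp_proliant_power:1, "
    "hp_proliant_psu:3, hp_proliant_raid:2, hp_proliant_temp:38, "
    "hp_sts_drvbox:2, interfaces:17, snmp_info:1, uptime:1), no new host labels"
)


def vanished_by_plugin(summary):
    """Return {plugin_name: count} parsed from 'vanished services (...)'."""
    match = re.search(r"vanished services \(([^)]*)\)", summary)
    if match is None:
        return {}
    counts = {}
    for item in match.group(1).split(","):
        # Each item looks like "hp_proliant_temp:38".
        name, _, number = item.strip().rpartition(":")
        counts[name] = int(number)
    return counts


counts = vanished_by_plugin(SUMMARY)
total = sum(counts.values())
# The three plugins with the most vanished services.
largest = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)[:3]
print(total)    # 103
print(largest)  # [('hp_proliant_temp', 38), ('hp_proliant_da_phydrv', 28), ('interfaces', 17)]
```

The per-plugin counts add up to exactly 103, and even basic entries such as snmp_info and uptime are in the list, so it looks like the SNMP query as a whole returned nothing usable rather than single components disappearing.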

On the host overview, however, all checks are green and OK; only Check_MK Discovery is CRIT.

I tried deleting the host and creating it again, with no success.

Can you give me a tip on what the problem could be?

Hi,

SNMP and the iLO boards are a never-ending story :slight_smile:

You can try the following:

  • Reboot the iLO board
  • Update the iLO board to the latest version
  • Disable the “interfaces” and “if64” checks for the mgmt boards with a disabled checks rule
  • Use Andreas' REST-based Redfish plugin and get rid of SNMP (preferred solution)
    Checkmk Exchange

I also tried the following: setting the service check timeout (Microcore) to 2 minutes.

This resulted in:

Error running automation call try-inventory: Your request timed out after 110 seconds. This issue may be related to a local configuration problem or a request which works with a too large number of objects. But if you think this issue is a bug, please send a crash report.

I have also seen SNMP timeouts of over 60 s with different Apollo systems.
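To find out how long the iLO really needs to answer, you could time a plain SNMP walk from the monitoring server outside of Checkmk. The following is a minimal sketch, not anything Checkmk ships: `time_command` is a hypothetical helper, and the default limit of 110 seconds is simply the value from the error message above.

```python
import subprocess
import time


def time_command(cmd, timeout_s=110.0):
    """Run cmd and return (elapsed_seconds, finished_in_time)."""
    start = time.monotonic()
    try:
        # Output is discarded; only the run time is of interest here.
        subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
            check=False,
        )
        finished = True
    except subprocess.TimeoutExpired:
        # The command was killed because it ran longer than timeout_s.
        finished = False
    return time.monotonic() - start, finished


# Example call (assumptions: Net-SNMP's snmpwalk is installed, and the host
# name and community string below are placeholders for your own values;
# .1.3.6.1.4.1.232 is the Compaq/HP enterprise subtree):
#
#   walk = ["snmpwalk", "-v2c", "-c", "public", "ilo-apollo.example.com", ".1.3.6.1.4.1.232"]
#   elapsed, finished = time_command(walk)
#   print(f"walk took {elapsed:.1f} s, finished in time: {finished}")
```

If the walk alone already takes longer than the configured timeouts, raising the service check timeout will probably not help much, and reducing the number of queried sections is the more promising route.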

Besides @aeckstein’s suggestions, which I also support :wink:, I would ask which type of iLO these Apollo systems have.
On “normal” HPE servers, excluding the

brings the most improvement.
The “disable_snmp_section” rule should also include all “hr_…”, “ucd_…” and “if…” SNMP sections.
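As a rough illustration of which sections those three prefixes would catch, here is a small Python sketch. The section names in `example` are only illustrative and may not match what your Checkmk version offers in the rule; `sections_to_disable` is a hypothetical helper, not part of Checkmk.

```python
# Prefixes taken from the suggestion above.
PREFIXES = ("hr_", "ucd_", "if")


def sections_to_disable(section_names, prefixes=PREFIXES):
    """Return the sorted section names that start with one of the prefixes."""
    # str.startswith accepts a tuple and matches any of its entries.
    return sorted(name for name in section_names if name.startswith(prefixes))


# Illustrative section names only; check the rule's own list for real ones.
example = [
    "hp_proliant_temp",
    "hr_cpu",
    "hr_mem",
    "ucd_cpu_load",
    "if64",
    "if",
    "snmp_info",
    "uptime",
]
print(sections_to_disable(example))
# ['hr_cpu', 'hr_mem', 'if', 'if64', 'ucd_cpu_load']
```

The hp_proliant sections, snmp_info and uptime stay active in this sketch; only the generic host resources, UCD and interface sections are filtered out.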

On your server, the 28 hard drives are also very time-consuming to query, both during normal checks and during discovery.
Unfortunately, this will not go away with the Redfish interface.
But you should try it and see how it works.
