Redfish Plugin discovery crashes with HPE iLO6

CMK version: Checkmk Enterprise Edition 2.3.0p17
OS version: Debian 12

Error message: Periodic service discovery crashes with “check failed - please submit a crash report!” message

Output of “cmk --debug -vvn hostname”:

value store: synchronizing
Trying to acquire lock on /omd/sites/prod/tmp/check_mk/counters/csesx01-ilo
Got lock on /omd/sites/prod/tmp/check_mk/counters/csesx01-ilo
value store: loading from disk
Releasing lock on /omd/sites/prod/tmp/check_mk/counters/csesx01-ilo
Released lock on /omd/sites/prod/tmp/check_mk/counters/csesx01-ilo
Checkmk version 2.3.0p17
+ FETCHING DATA
  Source: SourceInfo(hostname='csesx01-ilo', ipaddress='172.22.3.181', ident='special_redfish', fetcher_type=<FetcherType.SPECIAL_AGENT: 6>, source_type=<SourceType.HOST: 1>)
[cpu_tracking] Start [7f22576cd130]
Read from cache: AgentFileCache(csesx01-ilo, path_template=/omd/sites/prod/tmp/check_mk/data_source_cache/special_redfish/{hostname}, max_age=MaxAge(checking=0, discovery=2700.0, inventory=2700.0), simulation=False, use_only_cache=False, file_cache_mode=6)
Not using cache (Too old. Age is 974 sec, allowed is 0 sec)
Calling: /omd/sites/prod/local/lib/python3/cmk/plugins/redfish/libexec/agent_redfish -u checkmk --password-id uuid93cbad9e-849b-438a-afa1-f23159086e22:/omd/sites/prod/var/check_mk/passwords_merged -P https --timeout 20 172.22.3.181
Get data from program
Write data to cache file /omd/sites/prod/tmp/check_mk/data_source_cache/special_redfish/csesx01-ilo
Trying to acquire lock on /omd/sites/prod/tmp/check_mk/data_source_cache/special_redfish/csesx01-ilo
Got lock on /omd/sites/prod/tmp/check_mk/data_source_cache/special_redfish/csesx01-ilo
Releasing lock on /omd/sites/prod/tmp/check_mk/data_source_cache/special_redfish/csesx01-ilo
Released lock on /omd/sites/prod/tmp/check_mk/data_source_cache/special_redfish/csesx01-ilo
[cpu_tracking] Stop [7f22576cd130 - Snapshot(process=posix.times_result(user=0.0, system=0.0, children_user=0.55, children_system=0.04, elapsed=3.8599999994039536))]
  Source: SourceInfo(hostname='csesx01-ilo', ipaddress='172.22.3.181', ident='piggyback', fetcher_type=<FetcherType.PIGGYBACK: 4>, source_type=<SourceType.HOST: 1>)
[cpu_tracking] Start [7f22576cd130]
Read from cache: NoCache(csesx01-ilo, path_template=/dev/null, max_age=MaxAge(checking=0.0, discovery=0.0, inventory=0.0), simulation=False, use_only_cache=False, file_cache_mode=1)
No piggyback files for 'csesx01-ilo'. Skip processing.
No piggyback files for '172.22.3.181'. Skip processing.
Get piggybacked data
[cpu_tracking] Stop [7f22576cd130 - Snapshot(process=posix.times_result(user=0.0, system=0.0, children_user=0.0, children_system=0.0, elapsed=0.0))]
[cpu_tracking] Start [7f22576351c0]
+ PARSE FETCHER RESULTS
<<<check_mk:sep(32)>>> / Transition NOOPParser -> HostSectionParser
<<<redfish_manager:sep(0)>>> / Transition HostSectionParser -> HostSectionParser
<<<redfish_system:sep(0)>>> / Transition HostSectionParser -> HostSectionParser
<<<redfish_ethernetinterfaces:sep(0)>>> / Transition HostSectionParser -> HostSectionParser
<<<redfish_processors:sep(0)>>> / Transition HostSectionParser -> HostSectionParser
<<<redfish_memory:sep(0)>>> / Transition HostSectionParser -> HostSectionParser
<<<redfish_storage:sep(0)>>> / Transition HostSectionParser -> HostSectionParser
<<<redfish_networkinterfaces:sep(0)>>> / Transition HostSectionParser -> HostSectionParser
<<<redfish_drives:sep(0)>>> / Transition HostSectionParser -> HostSectionParser
<<<redfish_volumes:sep(0)>>> / Transition HostSectionParser -> HostSectionParser
<<<redfish_ethernetinterfaces:sep(0)>>> / Transition HostSectionParser -> HostSectionParser
<<<redfish_chassis:sep(0)>>> / Transition HostSectionParser -> HostSectionParser
<<<redfish_power:sep(0)>>> / Transition HostSectionParser -> HostSectionParser
<<<redfish_networkadapters:sep(0)>>> / Transition HostSectionParser -> HostSectionParser
<<<redfish_thermal:sep(0)>>> / Transition HostSectionParser -> HostSectionParser
  HostKey(hostname='csesx01-ilo', source_type=<SourceType.HOST: 1>)  -> Add sections: ['check_mk', 'redfish_chassis', 'redfish_drives', 'redfish_ethernetinterfaces', 'redfish_manager', 'redfish_memory', 'redfish_networkadapters', 'redfish_networkinterfaces', 'redfish_power', 'redfish_processors', 'redfish_storage', 'redfish_system', 'redfish_thermal', 'redfish_volumes']
  HostKey(hostname='csesx01-ilo', source_type=<SourceType.HOST: 1>)  -> Add sections: []
Received no piggyback data
Perfdata(name='02-CPU 1 PkgTmp', value=54.0, levels_upper=('fixed', (95.0, 95.0)), levels_lower=None, boundaries=(None, None))
Perfdata(name='03-CPU 2 PkgTmp', value=67.0, levels_upper=('fixed', (95.0, 95.0)), levels_lower=None, boundaries=(None, None))
CPU 1                Type: CPU, Model: Intel(R) Xeon(R) Gold 6442Y, Cores: 24, Threads: 48, Speed maximum 4000 MHz
CPU 2                Type: CPU, Model: Intel(R) Xeon(R) Gold 6442Y, Cores: 24, Threads: 48, Speed maximum 4000 MHz
Check_MK Agent       Version: 2.0, OS: iLO 6 - 1.63
Drive 0-480GB 6G SATA SSD Size: 447GB, Speed 6.0 Gbs, Media Life Left: 100%
Drive 1-480GB 6G SATA SSD Size: 447GB, Speed 6.0 Gbs, Media Life Left: 100%
Fan 1                Speed: 19.0%
Fan 2                Speed: 19.0%
Fan 3                Speed: 19.0%
Fan 4                Speed: 19.0%
Fan 5                Speed: 19.0%
Fan 6                Speed: 19.0%
Fan 7                Speed: 19.0%
Memory Summary       Capacity: 512GB, with State: Rollup State: Normal
Memory proc1dimm10   Size: 64GB, Type: DDR5-4800 MultiBitECC
Memory proc1dimm14   Size: 64GB, Type: DDR5-4800 MultiBitECC
Memory proc1dimm3    Size: 64GB, Type: DDR5-4800 MultiBitECC
Memory proc1dimm7    Size: 64GB, Type: DDR5-4800 MultiBitECC
Memory proc2dimm10   Size: 64GB, Type: DDR5-4800 MultiBitECC
Memory proc2dimm14   Size: 64GB, Type: DDR5-4800 MultiBitECC
Memory proc2dimm3    Size: 64GB, Type: DDR5-4800 MultiBitECC
Memory proc2dimm7    Size: 64GB, Type: DDR5-4800 MultiBitECC
Network adapter DA000000 Model: BCM 5719 1Gb 4p BASE-T OCP Adptr, SeNr: 1CH0150001, PartNr: BCM95719N1905HC
Network adapter DE080000 Model: HPE SN1610Q 32Gb 2p FC HBA, SeNr: MY540220V7, PartNr: R2E09A
Network adapter DE081000 Model: BCM57416, SeNr: VNM3390CJ0, PartNr: P26255-001
Network adapter DE082000 Model: BCM57416, SeNr: VNM3390GM5, PartNr: P26255-001
Physical port 10     Link: LinkUp, Speed: 1000Mbps, MAC: 04:32:01:9b:5f:85
Physical port 11     Link: Unknown, Speed: 0Mbps, MAC: 04:32:01:9b:5f:86
Physical port 12     Link: Unknown, Speed: 0Mbps, MAC: 04:32:01:9b:5f:87
Physical port 141    Link: LinkUp, Speed: 10000Mbps, MAC: 04:32:01:b6:30:00
Physical port 142    Link: LinkUp, Speed: 10000Mbps, MAC: 04:32:01:b6:30:01
Physical port 77     Link: LinkUp, Speed: 10000Mbps, MAC: 04:32:01:b7:92:50
Physical port 78     Link: LinkUp, Speed: 10000Mbps, MAC: 04:32:01:b7:92:51
Physical port 9      Link: LinkUp, Speed: 1000Mbps, MAC: 04:32:01:9b:5f:84
Power supply 0-HpeServerPowerSupply 0.0 Watts input, 201.0 Watts output, 228.0 V input, Capacity 800.0 Watts, Typ 865438-B21
Power supply 1-HpeServerPowerSupply 0.0 Watts input, 174.0 Watts output, 228.0 V input, Capacity 800.0 Watts, Typ 865438-B21
Storage controller DE00C000 Everything looks OK - 1 detail available
System state         System with SerialNr: CZJD0L00DK, has State: Component State: Normal, Rollup State: Normal, This resource is enabled.
Temperature 02-CPU 1 PkgTmp Temperature: 54.0 °C
Temperature 03-CPU 2 PkgTmp Temperature: 67.0 °C
Volume 239           Raid Type: RAID1, Size: 446.6GB
No piggyback files for 'csesx01-ilo'. Skip processing.
No piggyback files for '172.22.3.181'. Skip processing.
[cpu_tracking] Stop [7f22576351c0 - Snapshot(process=posix.times_result(user=0.030000000000000027, system=0.010000000000000009, children_user=0.0, children_system=0.0, elapsed=0.03999999910593033))]
[special_redfish] Success, [piggyback] Success (but no data found for this host), execution time 3.9 sec | execution_time=3.900 user_time=0.030 system_time=0.010 children_user_time=0.550 children_system_time=0.040 cmk_time_ds=3.270 cmk_time_agent=0.000

Output of “cmk --check-discovery hostname --debug -v”:

+ FETCHING DATA
Get data from program
No piggyback files for 'csesx01-ilo'. Skip processing.
No piggyback files for '172.22.3.181'. Skip processing.
Get piggybacked data
+ EXECUTING DISCOVERY PLUGINS (14)
Traceback (most recent call last):
  File "/omd/sites/prod/bin/cmk", line 114, in <module>
    exit_status = modes.call(mode_name, mode_args, opts, args)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/omd/sites/prod/lib/python3/cmk/base/modes/__init__.py", line 70, in call
    return handler(*handler_args)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/omd/sites/prod/lib/python3/cmk/base/modes/check_mk.py", line 1639, in mode_check_discovery
    with error_handler:
  File "/omd/sites/prod/lib/python3/cmk/base/errorhandling/_handler.py", line 66, in __exit__
    *_handle_failure(
     ^^^^^^^^^^^^^^^^
  File "/omd/sites/prod/lib/python3/cmk/base/errorhandling/_handler.py", line 103, in _handle_failure
    raise exc
  File "/omd/sites/prod/lib/python3/cmk/base/modes/check_mk.py", line 1642, in mode_check_discovery
    checks_result = execute_check_discovery(
                    ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/omd/sites/prod/lib/python3/cmk/checkengine/discovery/_impl.py", line 143, in execute_check_discovery
    services = get_host_services(
               ^^^^^^^^^^^^^^^^^^
  File "/omd/sites/prod/lib/python3/cmk/checkengine/discovery/_autodiscovery.py", line 605, in get_host_services
    **_get_node_services(
      ^^^^^^^^^^^^^^^^^^^
  File "/omd/sites/prod/lib/python3/cmk/checkengine/discovery/_autodiscovery.py", line 659, in _get_node_services
    discovered_services = discover_services(
                          ^^^^^^^^^^^^^^^^^^
  File "/omd/sites/prod/lib/python3/cmk/checkengine/discovery/_services.py", line 125, in discover_services
    {
  File "/omd/sites/prod/lib/python3/cmk/checkengine/discovery/_services.py", line 186, in _discover_plugins_services
    yield from plugin.function(check_plugin_name, **kwargs)
  File "/omd/sites/prod/lib/python3/cmk/base/checkers.py", line 884, in __discovery_function
    yield from (
  File "/omd/sites/prod/lib/python3/cmk/base/checkers.py", line 884, in <genexpr>
    yield from (
               ^
  File "/omd/sites/prod/lib/python3/cmk/base/api/agent_based/register/check_plugins.py", line 71, in filtered_generator
    for element in generator(*args, **kwargs):
  File "/omd/sites/prod/local/lib/python3/cmk/plugins/redfish/agent_based/redfish_voltage.py", line 32, in discovery_redfish_voltage
    yield Service(item=entry["Name"])
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/omd/sites/prod/lib/python3.12/site-packages/cmk/agent_based/v1/_checking_classes.py", line 117, in __new__
    item=cls._parse_item(item),
         ^^^^^^^^^^^^^^^^^^^^^
  File "/omd/sites/prod/lib/python3.12/site-packages/cmk/agent_based/v1/_checking_classes.py", line 128, in _parse_item
    raise TypeError(f"'item' must be a non empty string or ommited entirely, got {item!r}")
TypeError: 'item' must be a non empty string or ommited entirely, got ''
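
Reading the traceback: `discovery_redfish_voltage` yields `Service(item=entry["Name"])`, and at least one voltage entry on this iLO 6 apparently reports an empty `Name`, which `Service` rejects. The sketch below only illustrates that failure mode and one possible guard. It is not the actual fix in the newer plugin release. The `Service` class here is a hypothetical stand-in that mimics only the item check seen above, the function name `discover_voltage` is invented for the example, and the sample entries are made up.

```python
from dataclasses import dataclass
from typing import Any, Iterator, Mapping, Optional, Sequence


@dataclass(frozen=True)
class Service:
    """Hypothetical stand-in for Checkmk's Service class.

    It only mimics the item validation visible in the traceback.
    """

    item: Optional[str] = None

    def __post_init__(self) -> None:
        if self.item is None:
            return
        if not isinstance(self.item, str) or not self.item:
            raise TypeError(
                f"'item' must be a non-empty string or omitted entirely, got {self.item!r}"
            )


def discover_voltage(entries: Sequence[Mapping[str, Any]]) -> Iterator[Service]:
    """Yield one service per entry that has a usable name (assumed guard)."""
    for entry in entries:
        # A missing, None, or whitespace-only "Name" is treated as empty.
        name = str(entry.get("Name") or "").strip()
        if not name:
            # Skip the entry instead of crashing the whole discovery run.
            continue
        yield Service(item=name)


if __name__ == "__main__":
    # Made-up sample data: one named sensor, one empty name, one missing name.
    sample = [{"Name": "VBAT"}, {"Name": ""}, {"MemberId": "3"}]
    for service in discover_voltage(sample):
        print(service.item)
```

Skipping unnamed entries is the simplest choice for a sketch. A real plugin might instead fall back to another field of the entry so the sensor is still monitored.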

Hi,
Yesterday I updated from 2.2 to 2.3 and switched from Andreas’ HP iLO plugin to the new built-in Redfish plugin to monitor HPE iLO devices. On all of our iLO 6 devices, the periodic service discovery now crashes. Manual discovery and monitoring both seem to work, though, and older iLO generations are not affected.
Our firmware wasn’t current (v1.58), so we updated the iLO firmware to the newest available version, v1.63, but that didn’t change anything.

Hi Andreas,

Did you install the latest version of the Redfish plugin (2.3.60)?
The built-in version is usually a few versions behind current development.
If the problem still exists, you can send Andreas the output of the iLOs as described here

I have several iLO 6 devices running firmware 1.63 without problems, so it might be a special hardware configuration on your side.

Which version of the Redfish plugin are you using?
If it is the included one, please update to the latest release.
This error looks like one that has already been fixed.

Thanks for your quick answers. I used the included version and didn’t think about updating it. I will give it a try now.

OK, so I updated from the built-in version 2.3.38 to version 2.3.60, and then everything was completely broken :smiley:
I searched this forum for the error message and found this

That was the solution to the new problem. Now everything works, and my initial problem is gone as well.

Thanks again!
