Check_MK Discovery crashing on ibm_svc_systemstats check

Using CMK 2.0.0p12 (CEE), I am monitoring our IBM Flash System 7200.

The check worked without issues until we upgraded the FS7200 firmware from 8.4 to 8.5; since then the check has been crashing.

ValueError (invalid literal for int() with base 10: '0.000')

Output of the traceback is below.

  File "/omd/sites/nagios/lib/python3/cmk/base/decorator.py", line 37, in wrapped_check_func
    status, infotexts, long_infotexts, perfdata = check_func(hostname, *args, **kwargs)
  File "/omd/sites/nagios/lib/python3/cmk/base/discovery.py", line 787, in check_discovery
    services, host_label_discovery_result = _get_host_services(
  File "/omd/sites/nagios/lib/python3/cmk/base/discovery.py", line 1599, in _get_host_services
    services, host_label_discovery_result = _get_node_services(
  File "/omd/sites/nagios/lib/python3/cmk/base/discovery.py", line 1620, in _get_node_services
    services, host_label_discovery_result = _get_discovered_services(
  File "/omd/sites/nagios/lib/python3/cmk/base/discovery.py", line 1648, in _get_discovered_services
    discovered_services, host_label_discovery_result = _discover_host_labels_and_services(
  File "/omd/sites/nagios/lib/python3/cmk/base/discovery.py", line 1420, in _discover_host_labels_and_services
    discovered_services = [] if discovery_parameters.only_host_labels else _discover_services(
  File "/omd/sites/nagios/lib/python3/cmk/base/discovery.py", line 1469, in _discover_services
    service_table.update({
  File "/omd/sites/nagios/lib/python3/cmk/base/discovery.py", line 1469, in <dictcomp>
    service_table.update({
  File "/omd/sites/nagios/lib/python3/cmk/base/discovery.py", line 1537, in _execute_discovery
    yield from _enriched_discovered_services(hostname, check_plugin.name, plugins_services)
  File "/omd/sites/nagios/lib/python3/cmk/base/discovery.py", line 1551, in _enriched_discovered_services
    for service in plugins_services:
  File "/omd/sites/nagios/lib/python3/cmk/base/api/agent_based/register/check_plugins.py", line 72, in filtered_generator
    for element in generator(*args, **kwargs):
  File "/omd/sites/nagios/lib/python3/cmk/base/api/agent_based/register/check_plugins_legacy.py", line 84, in discovery_migration_wrapper
    original_discovery_result = disco_func(section)
  File "/omd/sites/nagios/share/check_mk/checks/ibm_svc_systemstats", line 133, in inventory_ibm_svc_systemstats_iops
    return [(key, None) for key in ibm_svc_systemstats_parse(info)]
  File "/omd/sites/nagios/share/check_mk/checks/ibm_svc_systemstats", line 69, in ibm_svc_systemstats_parse
    parsed["VDisks"][stat_name] = int(stat_current)

The output of local variables is:

{'_stat_peak': '0.000',
 '_stat_peak_time': '220505072556',
 'info': [['stat_name', 'stat_current', 'stat_peak', 'stat_peak_time'],
          ['compression_cpu_pc', '0', '0', '220505072556'],
          ['cpu_pc', '3', '5', '220505072531'],
          ['fc_mb', '0', '0', '220505072556'],
          ['fc_io', '0', '14', '220505072305'],
          ['sas_mb', '0', '0', '220505072556'],
          ['sas_io', '0', '0', '220505072556'],
          ['iscsi_mb', '0', '0', '220505072556'],
          ['iscsi_io', '0', '0', '220505072556'],
          ['write_cache_pc', '32', '32', '220505072556'],
          ['total_cache_pc', '48', '48', '220505072556'],
          ['vdisk_mb', '0', '0', '220505072556'],
          ['vdisk_io', '0', '15', '220505072305'],
          ['vdisk_ms', '0.099', '0.309', '220505072215'],
          ['mdisk_mb', '0', '318', '220505072526'],
          ['mdisk_io', '0', '980', '220505072531'],
          ['mdisk_ms', '0.662', '0.882', '220505072526'],
          ['drive_mb', '91', '430', '220505072526'],
          ['drive_io', '364', '1745', '220505072526'],
          ['drive_ms', '0.983', '1.738', '220505072546'],
          ['vdisk_r_mb', '0', '0', '220505072556'],
          ['vdisk_r_io', '0', '0', '220505072556'],
          ['vdisk_r_ms', '0.000', '0.000', '220505072556'],
          ['vdisk_w_mb', '0', '0', '220505072556'],
          ['vdisk_w_io', '0', '15', '220505072305'],
          ['vdisk_w_ms', '0.099', '0.309', '220505072215'],
          ['mdisk_r_mb', '0', '220', '220505072531'],
          ['mdisk_r_io', '0', '965', '220505072531'],
          ['mdisk_r_ms', '0.000', '0.430', '220505072526'],
          ['mdisk_w_mb', '0', '205', '220505072526'],
          ['mdisk_w_io', '0', '91', '220505072526'],
          ['mdisk_w_ms', '0.662', '3.203', '220505072526'],
          ['drive_r_mb', '90', '264', '220505072531'],
          ['drive_r_io', '362', '1072', '220505072531'],
          ['drive_r_ms', '0.984', '1.743', '220505072546'],
          ['drive_w_mb', '0', '251', '220505072526'],
          ['drive_w_io', '0', '1015', '220505072526'],
          ['drive_w_ms', '0.488', '0.921', '220505072526'],
          ['power_w', '670', '672', '220505072556'],
          ['temp_c', '21', '22', '220505072551'],
          ['temp_f', '69', '71', '220505072551'],
          ['iplink_mb', '0', '0', '220505072556'],
          ['iplink_io', '0', '0', '220505072556'],
          ['iplink_comp_mb', '0', '0', '220505072556'],
          ['cloud_up_mb', '0', '0', '220505072556'],
          ['cloud_up_ms', '0', '0', '220505072556'],
          ['cloud_down_mb', '0', '0', '220505072556'],
          ['cloud_down_ms', '0', '0', '220505072556'],
          ['iser_mb', '0', '0', '220505072556'],
          ['iser_io', '0', '0', '220505072556']],
 'parsed': {'VDisks': {'r_io': 0, 'r_mb': 0}},
 'stat_current': '0.000',
 'stat_name': 'r_ms'}

I suspect something changed with the firmware. CMK still returns inventory data for the host and the checks are created; only the Check_MK Discovery service is crashing.

Has anyone run into something similar, or does anyone have an idea how to fix it? I have already submitted the crash report to CMK.

Hi,

It looks like your system sends back a different structure after the firmware update. Do you have an old output to compare against, to see what changed? It looks like the float values in the agent data cause the problem.

Regards, Christian

Unfortunately I don’t have the return values from before the upgrade from 8.4 to 8.5; all I know is that after the upgrade the check started to crash. I hope this grabs the attention of the CMK developers, since the ibm_svc check will need some updating to work with the latest IBM firmware.

A quick fix for the problem. It is a workaround, but it should work.

Copy

/omd/sites/nagios/share/check_mk/checks/ibm_svc_systemstats

to

/omd/sites/nagios/local/share/check_mk/checks/ibm_svc_systemstats

then, in the copy, change line 69 from

    parsed["VDisks"][stat_name] = int(stat_current)
to
    parsed["VDisks"][stat_name] = int(float(stat_current))

Apparently the old firmware never produced floating-point numbers in the output. They now show up in a few places.
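
To illustrate why the one-line change helps, here is a minimal standalone sketch of the conversion. Python's int() rejects a string that contains a decimal point, while float() accepts it, so going through float() first avoids the ValueError. The helper name to_int is only for illustration and is not part of the plugin.

```python
def to_int(value: str) -> int:
    """Convert a numeric string such as '364' or '0.000' to an int.

    Going through float() first accepts decimal strings; int() then
    truncates the fractional part toward zero.
    """
    return int(float(value))


# Reproduce the crash from the traceback in isolation.
error_message = ""
try:
    int("0.000")
except ValueError as exc:
    error_message = str(exc)

print(error_message)
print(to_int("0.000"), to_int("0.983"), to_int("364"))
```

Note that the fractional part is discarded, so a latency of 0.983 ms becomes 0.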


Thanks Andreas,

That worked. I had to add the float() in a few places; it's no longer crashing now.

Line 69: parsed["VDisks"][stat_name] = int(float(stat_current))
Line 75: parsed["MDisks"][stat_name] = int(float(stat_current))
Line 81: parsed["Drives"][stat_name] = int(float(stat_current))
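
For readers who want to see the three changes in context, below is a simplified, hypothetical stand-in for the parse step. It is not the shipped Checkmk code; the function name parse_systemstats and the GROUPS table are assumptions, modelled only on the local variables visible in the traceback (rows of stat_name, stat_current, stat_peak, stat_peak_time, and keys such as r_io stored under "VDisks").

```python
# Hypothetical sketch, NOT the actual ibm_svc_systemstats source.
# Prefix-to-group mapping inferred from the traceback and the patched lines.
GROUPS = {"vdisk_": "VDisks", "mdisk_": "MDisks", "drive_": "Drives"}


def parse_systemstats(info):
    """Group read/write stats by device type, tolerating decimal strings."""
    parsed = {}
    for stat_name, stat_current, _stat_peak, _stat_peak_time in info:
        for prefix, group in GROUPS.items():
            if not stat_name.startswith(prefix):
                continue
            short_name = stat_name[len(prefix):]
            # Keep only the read/write variants (r_mb, w_io, r_ms, ...).
            if short_name.startswith(("r_", "w_")):
                # int(float(...)) accepts both '362' and '0.000'.
                parsed.setdefault(group, {})[short_name] = int(float(stat_current))
            break
    return parsed


sample = [
    ["stat_name", "stat_current", "stat_peak", "stat_peak_time"],
    ["cpu_pc", "3", "5", "220505072531"],
    ["vdisk_r_mb", "0", "0", "220505072556"],
    ["vdisk_r_ms", "0.000", "0.000", "220505072556"],
    ["mdisk_w_ms", "0.662", "3.203", "220505072526"],
    ["drive_r_io", "362", "1072", "220505072531"],
]
result = parse_systemstats(sample)
print(result)
```

The header row and unrelated stats such as cpu_pc are skipped because they match none of the prefixes.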

Re-inventory of the host did not work; I had to remove the host from CMK and re-add it before it picked up the changes in /omd/sites/nagios/local/share/check_mk/checks/ibm_svc_systemstats.

It works now. Just an FYI for anyone else who comes across this.


Exactly what I was looking for!

I avoided having to reconfigure the host by doing an omd restart after making the float changes to the check.

Thank you very much. I had a similar problem after upgrading to the 8.5 firmware on an IBM V5000 system, and this solution worked fine!

I can also confirm that newer SVC firmware versions deliver floating-point numbers in some fields.

Additional information: besides ibm_svc_systemstats, ibm_svc_nodestats was also affected in my case: stat_current = int(data["stat_current"])

Tip: int(float(…)) is on the safe side. At least for me, though, all the associated checks also coped with float(…), and the perfdata then nicely records the microsecond values (0.xxx ms).
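
The difference between the two conversions can be seen in a short sketch, using a latency value taken from the agent output above. The variable names are only for illustration.

```python
latency_ms = "0.099"  # e.g. vdisk_w_ms from the agent output

truncated = int(float(latency_ms))  # fraction is dropped
precise = float(latency_ms)         # sub-millisecond value is kept

print(truncated, precise)
```

With int(float(…)) every latency below 1 ms is reported as 0, whereas float(…) keeps the sub-millisecond value, provided the rest of the check can handle floats.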
