Automigration fails for builtin plugins

Hello,

I’m getting lots of warnings that the automigration failed – for builtin plugins, e.g. ~site/share/check_mk/checks/fast_lta_volumes. I verified that the contents on our file system match the contents of the current commit in the repo.

I feel like I’m missing context here – am I really supposed to manually edit builtin checks?

Output of cmk -R:

Failed to auto-migrate legacy plugin to section: fast_lta_volumes
Please refer to Werk 10601 for more information.
Failed to auto-migrate legacy plugin to section: juniper_trpz_cpu_util
Please refer to Werk 10601 for more information.
Failed to auto-migrate legacy plugin to section: juniper_trpz_flash
Please refer to Werk 10601 for more information.
Failed to auto-migrate legacy plugin to section: juniper_trpz_info
Please refer to Werk 10601 for more information.
Failed to auto-migrate legacy plugin to section: juniper_trpz_mem
Please refer to Werk 10601 for more information.
Failed to auto-migrate legacy plugin to section: juniper_trpz_power
Please refer to Werk 10601 for more information.
Failed to auto-migrate legacy plugin to section: printer_alerts
Please refer to Werk 10601 for more information.
Failed to auto-migrate legacy plugin to section: printer_input
Please refer to Werk 10601 for more information.
Failed to auto-migrate legacy plugin to section: printer_output
Please refer to Werk 10601 for more information.
Failed to auto-migrate legacy plugin to section: printer_supply
Please refer to Werk 10601 for more information.
Failed to auto-migrate legacy plugin to section: vutlan_ems_humidity
Please refer to Werk 10601 for more information.
Failed to auto-migrate legacy plugin to section: vutlan_ems_leakage
Please refer to Werk 10601 for more information.
Failed to auto-migrate legacy plugin to section: vutlan_ems_smoke
Please refer to Werk 10601 for more information.
Failed to auto-migrate legacy plugin to section: vutlan_ems_temp
Please refer to Werk 10601 for more information.
Failed to auto-migrate legacy plugin to section: watchdog_sensors
Please refer to Werk 10601 for more information.
Failed to auto-migrate legacy plugin to section: fast_lta_headunit.replication
Please refer to Werk 10601 for more information.
Failed to auto-migrate legacy plugin to section: fast_lta_headunit.status
Please refer to Werk 10601 for more information.
Failed to auto-migrate legacy plugin to section: fast_lta_silent_cubes.capacity
Please refer to Werk 10601 for more information.
Failed to auto-migrate legacy plugin to section: watchdog_sensors.dew
Please refer to Werk 10601 for more information.
Failed to auto-migrate legacy plugin to section: watchdog_sensors.humidity
Please refer to Werk 10601 for more information.
Failed to auto-migrate legacy plugin to section: watchdog_sensors.temp
Please refer to Werk 10601 for more information.
Generating configuration for core (type nagios)...

Precompiling host checks...OK
Validating Nagios configuration...OK
OK
Restarting monitoring core...OK
cmk -R  8.60s user 0.49s system 81% cpu 11.131 total

I hope this isn’t against the netiquette, but I’d like to bump this.

Does anyone know whether this is intentional? It seems strange to me that I should be the only one observing this.

Do you have any files inside the folder “~/local/share/check_mk/checks”?
Can you run cmk --debug -vvR instead of cmk -R to get the maximum output?

The --debug flag causes cmk to abort right after the first attempted automigration:

monitoring@monitoring ~ % cmk --debug -vvR
Files: ([], None)
Traceback (most recent call last):
  File "/omd/sites/monitoring/lib/python3/cmk/base/config.py", line 1986, in _extract_agent_and_snmp_sections
    create_snmp_section_plugin_from_legacy(
  File "/omd/sites/monitoring/lib/python3/cmk/base/api/agent_based/register/section_plugins_legacy/__init__.py", line 235, in create_snmp_section_plugin_from_legacy
    detect_spec = create_detect_spec(
  File "/omd/sites/monitoring/lib/python3/cmk/base/api/agent_based/register/section_plugins_legacy/convert_scan_functions.py", line 397, in create_detect_spec
    _compute_detect_spec(
  File "/omd/sites/monitoring/lib/python3/cmk/base/api/agent_based/register/section_plugins_legacy/convert_scan_functions.py", line 365, in _compute_detect_spec
    scan_func_ast = _get_scan_function_ast(section_name, scan_function, fallback_files)
  File "/omd/sites/monitoring/lib/python3/cmk/base/api/agent_based/register/section_plugins_legacy/convert_scan_functions.py", line 104, in _get_scan_function_ast
    assert source != "", "Files: %r" % ((read_files, src_file_name),)
AssertionError: Files: ([], None)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/omd/sites/monitoring/bin/cmk", line 79, in <module>
    errors = config.load_all_agent_based_plugins(check_api.get_check_api_context)
  File "/omd/sites/monitoring/lib/python3/cmk/base/config.py", line 1415, in load_all_agent_based_plugins
    errors.extend(load_checks(get_check_api_context, filelist))
  File "/omd/sites/monitoring/lib/python3/cmk/base/config.py", line 1563, in load_checks
    errors = (_extract_agent_and_snmp_sections(validate_creation_kwargs=did_compile) +
  File "/omd/sites/monitoring/lib/python3/cmk/base/config.py", line 2005, in _extract_agent_and_snmp_sections
    raise MKGeneralException(exc) from exc
cmk.utils.exceptions.MKGeneralException: Files: ([], None)
1 monitoring@monitoring ~ %  

This is the case regardless of whether there are contents in ~/local/share/check_mk/checks or not.
The folder contains a few checks, but they seem to be unrelated to the error message:
monitoring@monitoring ~ % ls ~/local/share/check_mk/checks
agdsn_ladon_temperature bacula_jobs mikrotik_signal_sixty mikrotik_signal_sixty_remote mikrotik_signal_sixty_remote_stats

I’m having trouble sharing the -vvR output without --debug, because it’s a ton of output and the migration error messages for whatever reason are not carried along pipes like | tee cmk.log. Investigating that.
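One possible explanation (an assumption, not confirmed anywhere in this thread) is that the migration warnings are written to stderr, while a plain `| tee cmk.log` only captures stdout. A minimal Python sketch of capturing both streams into one log follows. `run_and_log` is a hypothetical helper, and the demo command is only a stand-in for `cmk -vvR`:

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_and_log(cmd, log_path):
    """Run cmd, merge stderr into stdout, and write everything to log_path."""
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # same idea as 2>&1 in a shell
        text=True,
        check=False,  # keep the output even if the command exits non-zero
    )
    Path(log_path).write_text(result.stdout)
    return result.returncode, result.stdout


# Stand-in for ["cmk", "-vvR"]: writes one line to stdout and one to stderr.
demo_cmd = [
    sys.executable,
    "-c",
    "import sys; print('to stdout'); print('to stderr', file=sys.stderr)",
]
with tempfile.TemporaryDirectory() as tmp:
    returncode, output = run_and_log(demo_cmd, Path(tmp) / "cmk.log")
    logged = (Path(tmp) / "cmk.log").read_text()
```

If the warnings show up in the log this way, the shell equivalent would be to redirect stderr into the pipe as well.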

Adding a breakpoint() before the assert on line 104 of convert_scan_functions.py, and continuing until source is empty:

(Pdb) where
  /omd/sites/monitoring/bin/cmk(79)<module>()
-> errors = config.load_all_agent_based_plugins(check_api.get_check_api_context)
  /omd/sites/monitoring/lib/python3/cmk/base/config.py(1415)load_all_agent_based_plugins()
-> errors.extend(load_checks(get_check_api_context, filelist))
  /omd/sites/monitoring/lib/python3/cmk/base/config.py(1563)load_checks()
-> errors = (_extract_agent_and_snmp_sections(validate_creation_kwargs=did_compile) +
  /omd/sites/monitoring/lib/python3/cmk/base/config.py(1986)_extract_agent_and_snmp_sections()
-> create_snmp_section_plugin_from_legacy(
  /omd/sites/monitoring/lib/python3/cmk/base/api/agent_based/register/section_plugins_legacy/__init__.py(235)create_snmp_section_plugin_from_legacy()
-> detect_spec = create_detect_spec(
  /omd/sites/monitoring/lib/python3/cmk/base/api/agent_based/register/section_plugins_legacy/convert_scan_functions.py(398)create_detect_spec()
-> _compute_detect_spec(
  /omd/sites/monitoring/lib/python3/cmk/base/api/agent_based/register/section_plugins_legacy/convert_scan_functions.py(366)_compute_detect_spec()
-> scan_func_ast = _get_scan_function_ast(section_name, scan_function, fallback_files)
> /omd/sites/monitoring/lib/python3/cmk/base/api/agent_based/register/section_plugins_legacy/convert_scan_functions.py(105)_get_scan_function_ast()
-> assert source != "", "Files: %r" % ((read_files, src_file_name),)
(Pdb) dir()
['fallback_files', 'name', 'read_files', 'snmp_scan_function', 'source', 'src_file_name']
(Pdb) print(source)

(Pdb) print(name)
fast_lta_volumes
(Pdb) print(src_file_name)
None
(Pdb) print(fallback_files)
[]

So for some reason the source filename is empty. Investigating further.

The reason is that inspect.getsourcefile cannot return anything, because the function passed in is a lambda, as we can see from the source code I linked:

check_info["fast_lta_volumes"] = {
    # ...
    "snmp_scan_function": lambda oid: (oid(".1.3.6.1.2.1.1.2.0").startswith(
        ".1.3.6.1.4.1.8072.3.2.10") and oid(".1.3.6.1.4.1.27417.5.1.1.2")),
}
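Whether the lambda itself is the problem can be checked in isolation. The standard library's `inspect.getsourcefile` works from the file name recorded in the function's code object and returns None when it cannot find that file. The sketch below is plain Python and independent of Checkmk. `load_scan_function`, the module name and the shortened OID check are made up for illustration:

```python
import inspect
import os
import tempfile

SOURCE = "scan = lambda oid: oid('.1.3.6.1.2.1.1.2.0').startswith('.1.3.6.1.4.1.8072')\n"


def load_scan_function(source, recorded_path):
    """Compile source as if it had been read from recorded_path; return `scan`."""
    # Hypothetical module name that is deliberately not present in sys.modules.
    namespace = {"__name__": "hypothetical_legacy_check"}
    exec(compile(source, recorded_path, "exec"), namespace)
    return namespace["scan"]


with tempfile.TemporaryDirectory() as tmp:
    real_path = os.path.join(tmp, "fast_lta_volumes")
    with open(real_path, "w") as handle:
        handle.write(SOURCE)

    # The recorded file exists: the lambda's source file can be located.
    found = inspect.getsourcefile(load_scan_function(SOURCE, real_path))

    # The recorded file does not exist: getsourcefile returns None.
    missing_path = os.path.join(tmp, "no_such_dir", "fast_lta_volumes")
    missing = inspect.getsourcefile(load_scan_function(SOURCE, missing_path))

print(found is not None, missing)
```

In this sketch the lambda is located as long as the recorded path exists, so a None result points more towards a code object that refers to a file which is no longer there.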

I have installed several 2.0 systems by now and also upgraded two working 1.6 systems, and I did not get such an error message on any of them.
Was this system upgraded, or is it a clean new installation?

The system was upgraded from the latest 1.6 release, whose configuration, to my knowledge, did not emit any warnings. Unfortunately nobody kept an upgrade log.

Given that these are builtin checks, I would have assumed that all of them would be migrated regardless of configuration context. Perhaps it’s only done on-demand.

I’m going to look at the source a bit deeper and try to see why these checks are migrated (perhaps it’s only done when some service uses it, and I can identify what the service configuration or discovery data is on that one).

You don’t need to look at the source. This is a problem with one of your modified installed checks.
What happens if you move all the files from “~/local/share/check_mk/checks/” somewhere and try the “cmk -R”?
Or what happens if you create a new site on the same machine with 1 or 2 hosts inside, to check whether your installation is broken?


Sorry it took so long to respond, I’ve been unavailable for a while. Thank you so much for your assistance so far.

As expected, running cmk -vvR with the ~/local/share/check_mk/checks directory removed makes no difference, since all the error-inducing checks are builtin.

A new check_mk site with the same version does not have these symptoms (i.e. cmk -vvR does not yield any negative results). The only difference is that the new site is fresh and hasn’t undergone the 1.6→2.0 update.

Interestingly, a direct „site-by-site“ comparison (pun intended) of the files named after one check (fast_lta_volumes) yields the following:

root@monitoring ~ # md5sum ~monitoring{,_clean}/tmp/check_mk/check_includes/builtin/fast_lta_volumes
68b329da9893e34099c7d8ad5cb9c940  /omd/sites/monitoring/tmp/check_mk/check_includes/builtin/fast_lta_volumes
68b329da9893e34099c7d8ad5cb9c940  /omd/sites/monitoring_clean/tmp/check_mk/check_includes/builtin/fast_lta_volumes

But also

root@monitoring ~ # md5sum ~monitoring{,_clean}/var/check_mk/precompiled_checks/builtin/fast_lta_volumes
9faf94c307530d4d698d82c786d624b4  /omd/sites/monitoring/var/check_mk/precompiled_checks/builtin/fast_lta_volumes
c65d649da472524faff05058bbdbaa11  /omd/sites/monitoring_clean/var/check_mk/precompiled_checks/builtin/fast_lta_volumes

i.e., the checks seem to be the same, but the precompiled checks differ.
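The same comparison can be run over a whole folder instead of one file at a time. Here is a minimal Python sketch, with the hypothetical helpers `md5_of` and `differing_files` and throwaway directories standing in for the two sites' precompiled_checks/builtin folders:

```python
import hashlib
import tempfile
from pathlib import Path


def md5_of(path):
    """Return the hex MD5 digest of a file's contents."""
    return hashlib.md5(Path(path).read_bytes()).hexdigest()


def differing_files(dir_a, dir_b):
    """Return the names present in both directories whose contents differ."""
    dir_a, dir_b = Path(dir_a), Path(dir_b)
    names_a = {p.name for p in dir_a.iterdir() if p.is_file()}
    names_b = {p.name for p in dir_b.iterdir() if p.is_file()}
    return sorted(
        name for name in names_a & names_b
        if md5_of(dir_a / name) != md5_of(dir_b / name)
    )


# Demo data only: these directories and file contents are made up.
with tempfile.TemporaryDirectory() as tmp:
    upgraded, clean = Path(tmp, "upgraded"), Path(tmp, "clean")
    upgraded.mkdir()
    clean.mkdir()
    (upgraded / "fast_lta_volumes").write_bytes(b"stale bytecode")
    (clean / "fast_lta_volumes").write_bytes(b"fresh bytecode")
    (upgraded / "printer_alerts").write_bytes(b"same")
    (clean / "printer_alerts").write_bytes(b"same")
    changed = differing_files(upgraded, clean)
```

A differing checksum is only a hint, not proof: compiled files can embed site-specific details such as file paths, so two healthy sites may also produce different hashes.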

Perhaps I should try removing the precompiled_checks folder?

A-HA! That was the culprit.
(Re-)moving ~/var/check_mk/precompiled_checks resolved all the warnings / backtraces and cmk -vvR --debug went through.
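Moving the folder aside rather than deleting it keeps a way back in case something else depends on it. A minimal Python sketch of that step, with the hypothetical helper `move_aside` and a throwaway directory standing in for ~/var/check_mk/precompiled_checks:

```python
import shutil
import tempfile
import time
from pathlib import Path


def move_aside(directory):
    """Rename directory to a timestamped backup name and return the new path."""
    directory = Path(directory)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    backup = directory.with_name(f"{directory.name}.bak-{stamp}")
    shutil.move(str(directory), str(backup))
    return backup


# Demo data only: the directory layout mimics the one from this thread.
with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp, "precompiled_checks")
    (cache / "builtin").mkdir(parents=True)
    (cache / "builtin" / "fast_lta_volumes").write_bytes(b"stale")
    backup = move_aside(cache)
    cache_gone = not cache.exists()
    backup_kept = (backup / "builtin" / "fast_lta_volumes").read_bytes() == b"stale"
```

Once cmk -vvR --debug runs cleanly, the backup can be deleted.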

Thanks for your suggestion to compare to a clean site!
