Check_MK RAW 2.0 to 2.1 -> All Docker checks Warning

Hello,

I just upgraded the server from version 2.0.0p24 to 2.1.0p2.
I also upgraded the check-mk agents to 2.1.0p2.

Since then, the CPU utilization, Disk IO SUMMARY and Memory checks of all my containers have vanished.

There is no agent inside the Docker containers; the information is retrieved by piggyback.

If I do a rescan, I get these warnings:

> WARNING: Parsing of section docker_container_diskstat failed - please submit a crash report! (Crash-ID: 90182e0e-f140-11ec-8aa6-c609533838e6)
> WARNING: Parsing of section docker_container_labels failed - please submit a crash report! (Crash-ID: 9062e9a8-f140-11ec-8aa6-c609533838e6)
> WARNING: Parsing of section docker_container_cpu failed - please submit a crash report! (Crash-ID: 90912872-f140-11ec-8aa6-c609533838e6)
> WARNING: Parsing of section docker_container_mem failed - please submit a crash report! (Crash-ID: 90c069f2-f140-11ec-8aa6-c609533838e6)
> WARNING: Parsing of section docker_container_network failed - please submit a crash report! (Crash-ID: 90f09424-f140-11ec-8aa6-c609533838e6)

If I run check_mk_agent locally on the server, it seems to retrieve the information correctly (example for a single container):

<<<<rancid_nexus>>>>
<<<docker_container_mem:sep(124)>>>
@docker_version_info|{"PluginVersion": "0.1", "DockerPyVersion": "4.1.0", "ApiVersion": "1.41"}
<<<docker_container_mem:sep(0)>>>
{"usage": 44523520, "stats": {"active_anon": 0, "active_file": 11460608, "anon": 15167488, "anon_thp": 0, "file": 22573056, "file_dirty": 135168, "file_mapped": 7163904, "file_writeback": 0, "inactive_anon": 15052800, "inactive_file": 11075584, "kernel_stack": 393216, "pgactivate": 157113, "pgdeactivate": 1853, "pgfault": 418886061, "pglazyfree": 0, "pglazyfreed": 0, "pgmajfault": 132, "pgrefill": 4940, "pgscan": 32766, "pgsteal": 29844, "shmem": 0, "slab": 4979472, "slab_reclaimable": 4374456, "slab_unreclaimable": 605016, "sock": 0, "thp_collapse_alloc": 990, "thp_fault_alloc": 111111, "unevictable": 0, "workingset_activate": 0, "workingset_nodereclaim": 0, "workingset_refault": 0}, "limit": 17044328448}
<<<<>>>>
<<<<rancid_nexus>>>>
<<<docker_container_cpu:sep(124)>>>
@docker_version_info|{"PluginVersion": "0.1", "DockerPyVersion": "4.1.0", "ApiVersion": "1.41"}
<<<docker_container_cpu:sep(0)>>>
{"cpu_usage": {"total_usage": 11138271306000, "usage_in_kernelmode": 3283542459000, "usage_in_usermode": 7854728847000}, "system_cpu_usage": 13979536670000000, "online_cpus": 2, "throttling_data": {"periods": 0, "throttled_periods": 0, "throttled_time": 0}}
<<<<>>>>
<<<<rancid_nexus>>>>
<<<docker_container_diskstat:sep(124)>>>
@docker_version_info|{"PluginVersion": "0.1", "DockerPyVersion": "4.1.0", "ApiVersion": "1.41"}
<<<docker_container_diskstat:sep(0)>>>
{"io_service_bytes_recursive": [{"major": 8, "minor": 0, "op": "read", "value": 135168}, {"major": 8, "minor": 0, "op": "write", "value": 704512}, {"major": 254, "minor": 2, "op": "read", "value": 1384448}, {"major": 254, "minor": 2, "op": "write", "value": 2003542016}, {"major": 8, "minor": 32, "op": "read", "value": 13086720}, {"major": 8, "minor": 32, "op": "write", "value": 2003628032}, {"major": 254, "minor": 1, "op": "read", "value": 11702272}, {"major": 254, "minor": 1, "op": "write", "value": 86016}], "io_serviced_recursive": null, "io_queue_recursive": null, "io_service_time_recursive": null, "io_wait_time_recursive": null, "io_merged_recursive": null, "io_time_recursive": null, "sectors_recursive": null, "time": 1655802087.5058012, "names": {"254:1": "dm-1", "8:16": "sdb", "254:2": "dm-2", "254:0": "dm-0", "8:32": "sdc", "8:0": "sda", "254:3": "dm-3"}}
<<<<>>>>
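
To see how many data lines each piggybacked section carries, the output above can be tallied per host and section. This is only a minimal sketch, assuming the `<<<<host>>>>` / `<<<section>>>` framing shown above; the function name `count_section_lines` and the shortened sample are hypothetical and not part of Checkmk.

```python
import re
from collections import defaultdict

# Framing markers as they appear in the agent output above.
PIGGYBACK = re.compile(r"^<<<<(.*)>>>>$")
SECTION = re.compile(r"^<<<([^<>:]+)(?::[^<>]*)?>>>$")


def count_section_lines(agent_output):
    """Count data lines per (piggyback host, section name)."""
    counts = defaultdict(int)
    host = None
    section = None
    for line in agent_output.splitlines():
        piggyback = PIGGYBACK.match(line)
        if piggyback:
            # "<<<<>>>>" closes the piggyback block, so host becomes None.
            host = piggyback.group(1) or None
            section = None
            continue
        header = SECTION.match(line)
        if header:
            # Options such as ":sep(124)" are ignored for the tally.
            section = header.group(1)
            continue
        if host is not None and section is not None and line:
            counts[(host, section)] += 1
    return dict(counts)


# Shortened, hypothetical sample in the same shape as the output above.
sample = "\n".join([
    "<<<<rancid_nexus>>>>",
    "<<<docker_container_mem:sep(124)>>>",
    '@docker_version_info|{"PluginVersion": "0.1"}',
    "<<<docker_container_mem:sep(0)>>>",
    '{"usage": 44523520, "limit": 17044328448}',
    "<<<<>>>>",
])

print(count_section_lines(sample))
```

A single Docker host yields two lines per section (the version line and one JSON line). A higher count for the same container name would suggest that more than one source delivers data for it.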

Do you have any idea what the problem is?
Thanks in advance,
Mick

In addition:
cmk --debug -vvII rancid_nexus

Trying to acquire lock on /omd/sites/lyo2/var/check_mk/crashes/base/16efa592-f15a-11ec-907f-c609533838e6/crash.info
Got lock on /omd/sites/lyo2/var/check_mk/crashes/base/16efa592-f15a-11ec-907f-c609533838e6/crash.info
Releasing lock on /omd/sites/lyo2/var/check_mk/crashes/base/16efa592-f15a-11ec-907f-c609533838e6/crash.info
Released lock on /omd/sites/lyo2/var/check_mk/crashes/base/16efa592-f15a-11ec-907f-c609533838e6/crash.info
Traceback (most recent call last):
  File "/omd/sites/lyo2/bin/cmk", line 92, in <module>
    exit_status = modes.call(mode_name, mode_args, opts, args)
  File "/omd/sites/lyo2/lib/python3/cmk/base/modes/__init__.py", line 69, in call
    return handler(*handler_args)
  File "/omd/sites/lyo2/lib/python3/cmk/base/modes/check_mk.py", line 1699, in mode_discover
    discovery.commandline_discovery(
  File "/omd/sites/lyo2/lib/python3/cmk/base/agent_based/discovery/__init__.py", line 182, in commandline_discovery
    _commandline_discovery_on_host(
  File "/omd/sites/lyo2/lib/python3/cmk/base/agent_based/discovery/__init__.py", line 249, in _commandline_discovery_on_host
    host_labels = analyse_node_labels(
  File "/omd/sites/lyo2/lib/python3/cmk/base/agent_based/discovery/_host_labels.py", line 79, in analyse_node_labels
    discovered_host_labels=_discover_host_labels(
  File "/omd/sites/lyo2/lib/python3/cmk/base/agent_based/discovery/_host_labels.py", line 214, in _discover_host_labels
    **_discover_host_labels_for_source_type(
  File "/omd/sites/lyo2/lib/python3/cmk/base/agent_based/discovery/_host_labels.py", line 237, in _discover_host_labels_for_source_type
    parsed_results = parsed_sections_broker.all_parsing_results(host_key)
  File "/omd/sites/lyo2/lib/python3/cmk/base/agent_based/data_provider.py", line 293, in all_parsing_results
    return sorted(resolver.resolve_all(parser), key=lambda r: r.section.name)
  File "/omd/sites/lyo2/lib/python3/cmk/base/agent_based/data_provider.py", line 203, in <genexpr>
    if (res := self.resolve(parser, psn)) is not None
  File "/omd/sites/lyo2/lib/python3/cmk/base/agent_based/data_provider.py", line 188, in resolve
    if (parsing_result := parser.parse(producer)) is not None:
  File "/omd/sites/lyo2/lib/python3/cmk/base/agent_based/data_provider.py", line 104, in parse
    if (parsed := self._parse_raw_data(section)) is None
  File "/omd/sites/lyo2/lib/python3/cmk/base/agent_based/data_provider.py", line 122, in _parse_raw_data
    return section.parse_function(list(raw_data))
  File "/omd/sites/lyo2/lib/python3/cmk/base/plugins/agent_based/inventory_docker_container_network.py", line 18, in parse_docker_container_network
    return docker.parse(string_table).data
  File "/omd/sites/lyo2/lib/python3/cmk/base/plugins/agent_based/utils/docker.py", line 99, in parse
    raise ValueError(
ValueError: Expected list of length 2. First element list of 2 strings, second element list of 1 string

cat /omd/sites/lyo2/var/check_mk/crashes/base/16efa592-f15a-11ec-907f-c609533838e6/crash.info

{"time": 1655812977.2687418, "os": "Debian GNU/Linux 11 (bullseye)", "version": "2.1.0p2", "edition": "cre", "core": "nagios", "python_version": "3.9.10 (main, May 11 2022, 22:14:42) \n[GCC 11.2.0]", "python_paths": ["/opt/omd/versions/2.1.0p2.cre/bin", "/omd/sites/lyo2/local/lib/python3", "/omd/sites/lyo2/lib/python3/plus", "/omd/sites/lyo2/lib/python39.zip", "/omd/sites/lyo2/lib/python3.9", "/omd/sites/lyo2/lib/python3.9/lib-dynload", "/omd/sites/lyo2/lib/python3.9/site-packages", "/omd/sites/lyo2/lib/python3"], "id": "16efa592-f15a-11ec-907f-c609533838e6", "crash_type": "base", "exc_type": "ValueError", "exc_value": "Expected list of length 2. First element list of 2 strings, second element list of 1 string", "exc_traceback": [["/omd/sites/lyo2/bin/cmk", 92, "<module>", "exit_status = modes.call(mode_name, mode_args, opts, args)"], ["/omd/sites/lyo2/lib/python3/cmk/base/modes/__init__.py", 69, "call", "return handler(*handler_args)"], ["/omd/sites/lyo2/lib/python3/cmk/base/modes/check_mk.py", 1699, "mode_discover", "discovery.commandline_discovery("], ["/omd/sites/lyo2/lib/python3/cmk/base/agent_based/discovery/__init__.py", 182, "commandline_discovery", "_commandline_discovery_on_host("], ["/omd/sites/lyo2/lib/python3/cmk/base/agent_based/discovery/__init__.py", 249, "_commandline_discovery_on_host", "host_labels = analyse_node_labels("], ["/omd/sites/lyo2/lib/python3/cmk/base/agent_based/discovery/_host_labels.py", 79, "analyse_node_labels", "discovered_host_labels=_discover_host_labels("], ["/omd/sites/lyo2/lib/python3/cmk/base/agent_based/discovery/_host_labels.py", 214, "_discover_host_labels", "**_discover_host_labels_for_source_type("], ["/omd/sites/lyo2/lib/python3/cmk/base/agent_based/discovery/_host_labels.py", 237, "_discover_host_labels_for_source_type", "parsed_results = parsed_sections_broker.all_parsing_results(host_key)"], ["/omd/sites/lyo2/lib/python3/cmk/base/agent_based/data_provider.py", 293, "all_parsing_results", "return 
sorted(resolver.resolve_all(parser), key=lambda r: r.section.name)"], ["/omd/sites/lyo2/lib/python3/cmk/base/agent_based/data_provider.py", 203, "<genexpr>", "if (res := self.resolve(parser, psn)) is not None"], ["/omd/sites/lyo2/lib/python3/cmk/base/agent_based/data_provider.py", 188, "resolve", "if (parsing_result := parser.parse(producer)) is not None:"], ["/omd/sites/lyo2/lib/python3/cmk/base/agent_based/data_provider.py", 104, "parse", "if (parsed := self._parse_raw_data(section)) is None"], ["/omd/sites/lyo2/lib/python3/cmk/base/agent_based/data_provider.py", 122, "_parse_raw_data", "return section.parse_function(list(raw_data))"], ["/omd/sites/lyo2/lib/python3/cmk/base/plugins/agent_based/inventory_docker_container_network.py", 18, "parse_docker_container_network", "return docker.parse(string_table).data"], ["/omd/sites/lyo2/lib/python3/cmk/base/plugins/agent_based/utils/docker.py", 99, "parse", "raise ValueError("]], "local_vars": "", "details": {"argv": ["/omd/sites/lyo2/bin/cmk", "--debug", "-vvII", "rancid_nexus"], "env": {"SHELL": "/bin/bash", "OMD_ROOT": "/omd/sites/lyo2", "HISTCONTROL": "", "NAGIOS_PLUGIN_STATE_DIRECTORY": "/omd/sites/lyo2/var/monitoring-plugins", "HISTSIZE": "10000", "HISTTIMEFORMAT": "%d/%m/%y %T ", "PWD": "/omd/sites/lyo2", "LOGNAME": "lyo2", "MANPATH": "/omd/sites/lyo2/share/man::/opt/puppetlabs/puppet/share/man", "TEMPDIR": "/tmp/user/998", "MODULEBUILDRC": "/omd/sites/lyo2/.modulebuildrc", "HOME": "/omd/sites/lyo2", "LANG": "C.UTF-8", "HISTFILE": "/omd/sites/lyo2/.bash_history", "TMPDIR": "/tmp/user/998", "PROMPT_COMMAND": "history -a", "PERL5LIB": "/omd/sites/lyo2/local/lib/perl5/lib/perl5:/omd/sites/lyo2/lib/perl5/lib/perl5:", "OMD_SITE": "lyo2", "TERM": "xterm-256color", "USER": "lyo2", "TEMP": "/tmp/user/998", "PERL_MM_OPT": "INSTALL_BASE=/omd/sites/lyo2/local/lib/perl5/", "SHLVL": "1", "LD_LIBRARY_PATH": "/omd/sites/lyo2/local/lib:/omd/sites/lyo2/lib", "REQUESTS_CA_BUNDLE": "/omd/sites/lyo2/var/ssl/ca-certificates.crt", 
"LC_ALL": "C.UTF-8", "TMOUT": "14400", "TMP": "/tmp/user/998", "PATH": "/omd/sites/lyo2/lib/perl5/bin:/omd/sites/lyo2/local/bin:/omd/sites/lyo2/bin:/omd/sites/lyo2/local/lib/perl5/bin:/usr/local/bin:/usr/bin:/bin:/opt/puppetlabs/bin", "HISTIGNORE": "", "MP_STATE_DIRECTORY": "/omd/sites/lyo2/var/monitoring-plugins", "HISTFILESIZE": "999999", "MAIL": "/var/mail/lyo2", "MAILRC": "/omd/sites/lyo2/etc/mail.rc", "_": "/omd/sites/lyo2/bin/cmk"}}}

I think the problem is there, but I haven’t found anything about how to fix it.

Thanks,
Mick

Is the Docker plugin in use also version 2.1?
With the older Docker plugin, the result could look like yours.

Hi andreas-doehler,

Yes, I’ve deployed the latest version of the plugin too:

# grep "__version__" /usr/lib/check_mk_agent/plugins/mk_docker.py
__version__ = "2.1.0p2"

Do you have two docker hosts in Checkmk which run a docker container with the same name?

Hello chauhan_sudhir,

Yes, all my containers are duplicated on several hosts.
Is that not possible anymore? :face_with_spiral_eyes:
It worked well in version 2.0.

Mick
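
A minimal sketch of why duplicate container names can produce the ValueError from the traceback, assuming that piggyback data from both parent hosts ends up in the same section of the piggybacked host. The function `parse_docker_section` is a hypothetical stand-in that only mirrors the error message; it is not the Checkmk source.

```python
import json


def parse_docker_section(string_table):
    """Hypothetical strict check: one version line plus one JSON data line."""
    if not (
        len(string_table) == 2
        and len(string_table[0]) == 2
        and len(string_table[1]) == 1
    ):
        raise ValueError(
            "Expected list of length 2. "
            "First element list of 2 strings, second element list of 1 string"
        )
    version_info = json.loads(string_table[0][1])
    data = json.loads(string_table[1][0])
    return version_info, data


# The sep(124) header splits the version line at "|" into two strings;
# the sep(0) header leaves the JSON line as a single string.
version_line = ["@docker_version_info", '{"PluginVersion": "0.1", "ApiVersion": "1.41"}']
data_line = ['{"usage": 44523520, "limit": 17044328448}']

one_host = [version_line, data_line]
two_hosts = one_host + one_host  # same container name reported by two parent hosts

print(parse_docker_section(one_host)[1]["usage"])

try:
    parse_docker_section(two_hosts)
except ValueError as exc:
    print("parsing failed:", exc)
```

Under that assumption, the section for a container that runs on two hosts has four lines instead of two, so a strict length check rejects it.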

You can try these two options to solve your problem:

  1. Use container id as host name
  2. Use “Hostname translation for piggybacked hosts” with different translations for the two parent hosts

Thanks chauhan_sudhir.

  1. I can’t use the ID because it may change regularly.
  2. I will look at this option but for the moment I do not understand it :sweat_smile:

It’s a pity because it worked very well with version 2.0.
You just had to declare a Docker container as a host in Checkmk (with no IP), and with piggyback it was detected on all hosts.
If the container stopped on one of the hosts, the alarm indicated on which host it had stopped.
It was great :slight_smile:

You need a “Hostname translation for piggybacked hosts” rule for each and every server running Docker.

This, for example, would prefix each Docker container on server1 with the hostname (server1-container_foo, server1-container_bar, etc.).

For a few Docker servers this works well. If you have lots of them, there’s currently no automatic way to make containers both unique to CMK and human readable.
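
The effect of such a per-server prefix can be sketched as follows. This only illustrates the resulting names, not the rule's configuration format in Checkmk; the function `translate_piggyback_name` and the host and container names are hypothetical.

```python
def translate_piggyback_name(parent_host, container):
    """Prefix the container name with the host it runs on."""
    return f"{parent_host}-{container}"


# Hypothetical inventory: container_foo runs on both servers.
containers_by_host = {
    "server1": ["container_foo", "container_bar"],
    "server2": ["container_foo"],
}

translated = [
    translate_piggyback_name(host, container)
    for host, containers in containers_by_host.items()
    for container in containers
]

print(translated)
# ['server1-container_foo', 'server1-container_bar', 'server2-container_foo']
```

After the translation every piggybacked host name is unique, so each one receives data from exactly one parent host.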

Thanks msommer,

Yes, I have a lot of containers, and I don’t want to do too much configuration to monitor them.
It is not critical: only the CPU, Memory and Disk IO checks no longer work. The state of the containers is still reported, and I will be satisfied with that. Thank you :wink:

Mick