The strange world of checkmk packages/plugins

@andreas-doehler apparently you’ve already encountered this error context :slight_smile:

You cannot execute checks directly with Python.
All checks are executed when you run the “cmk” command as the site user.
Also, Checkmk ships its own Python version; whatever is installed on your system is not relevant.

To track down your problem, work through these steps:

  • check ~/tmp/check_mk/cache/ for the file named after your host;
    inside this file you should see your dfs section
  • if the section is there, run “cmk --debug -vvI hostname” to see whether a new check is discovered
  • you can also use “--detect-sections” with your dfs section name to force the system to look for your section.
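
The first step can be scripted. The following is a minimal Python sketch, assuming the cached agent output is a plain-text file at ~/tmp/check_mk/cache/<hostname> as described above. The helper names list_sections and cached_sections and the sample text are made up for illustration. It collects the section names from the `<<<name:options>>>` header lines, so you can see at a glance whether dfs_backlog is present.

```python
import re
from pathlib import Path

# Matches agent section headers such as <<<dfs_backlog:sep(59)>>> or <<<uptime>>>.
# The empty header <<<>>> does not match because a name is required.
SECTION_HEADER = re.compile(r"^<<<([^:>]+)(?::[^>]*)?>>>\s*$")


def list_sections(agent_output: str) -> list[str]:
    """Return the section names found in raw agent output, in order."""
    names = []
    for line in agent_output.splitlines():
        match = SECTION_HEADER.match(line)
        if match:
            names.append(match.group(1))
    return names


def cached_sections(hostname: str) -> list[str]:
    """Read the cache file of a host (path as given in the steps above)."""
    cache_file = Path.home() / "tmp" / "check_mk" / "cache" / hostname
    return list_sections(cache_file.read_text(errors="replace"))


# Illustrative sample, not real agent output.
sample = """<<<check_mk>>>
Version: 2.2.0p20
<<<dfs_backlog:sep(59)>>>
SHARE1 (HOST1);0
<<<>>>
<<<uptime>>>
12345
"""
sections = list_sections(sample)
print(sections)
print("dfs_backlog" in sections)
```

Run as the site user, `cached_sections("MYSERVER")` would apply the same check to the real cache file.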

Hello,

Thank you. I checked the sections; for dfs_backlog, the section seems to be checkmk_agent_plugins_win:

<<<checkmk_agent_plugins_win:sep(0)>>>
pluginsdir C:\ProgramData\checkmk\agent\plugins
localdir C:\ProgramData\checkmk\agent\local
C:\ProgramData\checkmk\agent\plugins\dfs_backlog.ps1:CMK_VERSION = unversioned

I conclude that the section for dfs_backlog is checkmk_agent_plugins_win.

This section is parsed, but I do not find dfs_backlog in the discovery result:

cpu_tracking] Stop [7f5b32897190 - Snapshot(process=posix.times_result(user=0.0, system=0.0, children_user=0.0, children_system=0.0, elapsed=0.0))]
+ PARSE FETCHER RESULTS
<<<check_mk>>> / Transition NOOPParser -> HostSectionParser
<<<cmk_agent_ctl_status:sep(0)>>> / Transition HostSectionParser -> HostSectionParser
<<<wmi_cpuload:sep(124)>>> / Transition HostSectionParser -> HostSectionParser
<<<uptime>>> / Transition HostSectionParser -> HostSectionParser
<<<mem>>> / Transition HostSectionParser -> HostSectionParser
<<<fileinfo:sep(124)>>> / Transition HostSectionParser -> HostSectionParser
<<<winperf_phydisk>>> / Transition HostSectionParser -> HostSectionParser
<<<winperf_if>>> / Transition HostSectionParser -> HostSectionParser
<<<winperf_processor>>> / Transition HostSectionParser -> HostSectionParser
<<<logwatch>>> / Transition HostSectionParser -> HostSectionParser
<<<checkmk_agent_plugins_win:sep(0)>>> / Transition HostSectionParser -> HostSectionParser
<<<dotnet_clrmemory:sep(124)>>> / Transition HostSectionParser -> HostSectionParser
<<<services>>> / Transition HostSectionParser -> HostSectionParser
<<<df:sep(9)>>> / Transition HostSectionParser -> HostSectionParser
<<<ps:sep(9)>>> / Transition HostSectionParser -> HostSectionParser
Transition HostSectionParser -> NOOPParser
Transition NOOPParser -> NOOPParser
<<<systemtime>>> / Transition NOOPParser -> HostSectionParser
  HostKey(hostname='MYSERVER', source_type=<SourceType.HOST: 1>)  -> Add sections: ['check_mk', 'checkmk_agent_plugins_win', 'cmk_agent_ctl_status', 'df', 'dotnet_clrmemory', 'fileinfo', 'logwatch', 'mem', 'ps', 'services', 'systemtime', 'uptime', 'winperf_if', 'winperf_phydisk', 'winperf_processor', 'wmi_cpuload']
  HostKey(hostname='MYSERVER', source_type=<SourceType.HOST: 1>)  -> Add sections: []
Received no piggyback data
+ ANALYSE DISCOVERED HOST LABELS
Trying host label discovery with: check_mk, checkmk_agent_plugins_win, cmk_agent_ctl_status, df, dotnet_clrmemory, fileinfo, logwatch, mem, ps, services, systemtime, uptime, winperf_if, winperf_phydisk, winperf_processor, wmi_cpuload
  cmk/os_family: windows (check_mk)
Trying host label discovery with: 
SUCCESS - Found 1 host labels
+ ANALYSE DISCOVERED SERVICES
+ EXECUTING DISCOVERY PLUGINS (27)
  Trying discovery with: services_summary, uptime, check_mk_only_from, logwatch, fileinfo_groups, logwatch_ec, winperf_processor_util, winperf_phydisk, mssql_transactionlogs, df, systemtime, docker_container_status_uptime, mem_vmalloc, esx_vsphere_hostsystem_cpu_usage, ps, logwatch_ec_single, services, winperf_if, fileinfo, domino_tasks, dotnet_clrmemory, mssql_datafiles, logwatch_groups, mem_linux, checkmk_agent, mem_win, wmi_cpuload
SUCCESS - Found no new services
When I use “--detect-sections”, I get this result:
+ PARSE FETCHER RESULTS
<<<check_mk>>> / Transition NOOPParser -> HostSectionParser
<<<cmk_agent_ctl_status:sep(0)>>> / Transition HostSectionParser -> HostSectionParser
<<<wmi_cpuload:sep(124)>>> / Transition HostSectionParser -> HostSectionParser
<<<uptime>>> / Transition HostSectionParser -> HostSectionParser
<<<fileinfo:sep(124)>>> / Transition HostSectionParser -> HostSectionParser
<<<mem>>> / Transition HostSectionParser -> HostSectionParser
<<<winperf_phydisk>>> / Transition HostSectionParser -> HostSectionParser
<<<winperf_if>>> / Transition HostSectionParser -> HostSectionParser
<<<winperf_processor>>> / Transition HostSectionParser -> HostSectionParser
<<<logwatch>>> / Transition HostSectionParser -> HostSectionParser
<<<checkmk_agent_plugins_win:sep(0)>>> / Transition HostSectionParser -> HostSectionParser
<<<services>>> / Transition HostSectionParser -> HostSectionParser
<<<dotnet_clrmemory:sep(124)>>> / Transition HostSectionParser -> HostSectionParser
<<<df:sep(9)>>> / Transition HostSectionParser -> HostSectionParser
<<<ps:sep(9)>>> / Transition HostSectionParser -> HostSectionParser
Transition HostSectionParser -> NOOPParser
Transition NOOPParser -> NOOPParser
<<<systemtime>>> / Transition NOOPParser -> HostSectionParser
  HostKey(hostname='MYSERVER', source_type=<SourceType.HOST: 1>)  -> Add sections: ['checkmk_agent_plugins_win']
  HostKey(hostname='MYSERVER', source_type=<SourceType.HOST: 1>)  -> Add sections: []
Received no piggyback data
+ EXECUTING INVENTORY PLUGINS
 allnet_ip_sensoric: skipped (no data)
 allnet_ip_sensoric: skipped (no data)
 aruba_wlc_aps: skipped (no data)
 aruba_wlc_aps: skipped (no data)
 azure_app_gateway: skipped (no data)
 azure_app_gateway: skipped (no data)
 check_mk: skipped (no data)
 check_mk: skipped (no data)
 checkmk_agent_plugins: ok
 checkmk_agent_plugins: skipped (no data)

I cannot find the checkmk_agent_plugins_win section.

On the Windows server, the agent test command shows:

C:\Program Files x86\checkmk\service> .\check_mk_agent.exe test

<<<dfs_backlog:sep(59)>>>
EDI (DENOTMS750);0
EDI (LOCALHOST);1
SHARE (DENOTMS750);0
SHARE (LOCALHOST);0
<<<>>>

So I’m confused about the section name: <<<dfs_backlog>>> or <<<checkmk_agent_plugins_win>>>?

Best regards

I’m not a Windows admin, but could this be a permission issue?
The checkmkService runs as NT AUTHORITY\SYSTEM, and according to checkmk_dfs_backlog/agents/plugins/dfs_backlog.ps1 at main · ellr/checkmk_dfs_backlog · GitHub, you need permissions to run the necessary WMI queries.

Just as a test, can you run the checkmkService with your own user and see whether that changes what you get when you run:

cmk -d <hostname_with_the_plugin> | grep -A 5 "<<<dfs_backlog"


Thank you @gstolz for your help. I asked the Windows experts to check the rights again, and the problem was indeed caused by missing execution rights on certain parts of the ps1 script.

Thank you very much @andreas-doehler and @gstolz for your help with this issue.

When I launch a rescan of services in the Checkmk web UI, I get this error:

Starting job...
WARNING: Parsing of section dfs_backlog failed - please submit a crash report! (Crash-ID: 3ab9ae1e-c502-11ee-8667-005056b8f7b9)
Completed.

Do you know how we can enable a debug mode for this check?

Best regards,

I found the crash reports in ~/var/check_mk/crashes/section/ID/crash.info:

{"time": 1707232812.487336, "os": "Red Hat Enterprise Linux release 8.9 (Ootpa)", "version": "2.2.0p20", "edition": "cce", "core": "nagios", "python_version": "3.11.5 (main, Dec 1 2023, 14:58:52) [GCC 13.2.0]", "python_paths": ["/opt/omd/versions/2.2.0p20.cce/bin", "/omd/sites/monitoring/local/lib/python3", "/omd/sites/monitoring/lib/python3/cloud", "/omd/sites/monitoring/lib/python311.zip", "/omd/sites/monitoring/lib/python3.11", "/omd/sites/monitoring/lib/python3.11/lib-dynload", "/omd/sites/monitoring/lib/python3.11/site-packages", "/omd/sites/monitoring/lib/python3"], "id": "391f6ade-c503-11ee-b855-005056b8f7b9", "crash_type": "section", "exc_type": "IndexError", "exc_value": "list index out of range", "exc_traceback": [["/omd/sites/monitoring/lib/python3/cmk/base/agent_based/data_provider.py", 106, "_parse_raw_data", "return parse_function(list(raw_data))"], ["/omd/sites/monitoring/local/lib/python3/cmk/base/plugins/agent_based/dfs_backlog.py", 109, "parse_dfs_backlog", "return [DfsReplication.from_string_table(line) for line in string_table]"], ["/omd/sites/monitoring/local/lib/python3/cmk/base/plugins/agent_based/dfs_backlog.py", 109, "<listcomp>", "return [DfsReplication.from_string_table(line) for line in string_table]"], ["/omd/sites/monitoring/local/lib/python3/cmk/base/plugins/agent_based/dfs_backlog.py", 75, "from_string_table", "direction: str = descr_raw[2]"]], "local_vars": "eydkZXNjcl9yYXcnOiBbJ0VESScsICcoREVIQU1NUzEzNzgnXSwKICdsaW5lJzogWydFREkgKERFSEFNTVMxMzc4KScsICcxJ10sCiAnc2hhcmVfbmFtZSc6ICdFREknfQ==", "details": {"section_name": "dfs_backlog", "section_content": [["SHARE1 (REMOTE_HOST)", "1"], ["SHARE1 (LOCALHOST)", "0"], ["SHARE2 (REMOTE_HOST)", "0"], ["SHARE2 (LOCALHOST)", "0"]], "host_name": "MYSERVER"}}
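
The local_vars field of this crash report looks opaque, but it is only Base64-encoded text. A short Python sketch, assuming the decoded payload is a printed Python dict of plain literals (which is the case for this particular report), shows the variables at the moment of the crash. The constant name LOCAL_VARS_B64 is chosen here for illustration.

```python
import ast
import base64

# "local_vars" value copied from the crash report above.
LOCAL_VARS_B64 = (
    "eydkZXNjcl9yYXcnOiBbJ0VESScsICcoREVIQU1NUzEzNzgnXSwKICdsaW5lJzogWydFREkg"
    "KERFSEFNTVMxMzc4KScsICcxJ10sCiAnc2hhcmVfbmFtZSc6ICdFREknfQ=="
)

decoded = base64.b64decode(LOCAL_VARS_B64).decode("utf-8")
print(decoded)

# Assumption: the text is a dict of plain literals, so literal_eval can read it.
local_vars = ast.literal_eval(decoded)

# Number of tokens the plugin had available when it accessed descr_raw[2].
print(len(local_vars["descr_raw"]))
```

Here descr_raw holds only two elements, so the access `descr_raw[2]` shown in the traceback (line 75 of dfs_backlog.py) raises the IndexError.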

Can you share the full agent output as well, or at least the full dfs_backlog section?

If this is the real agent output, then something went wrong on the host producing the data.
The agent output needs to look exactly like the example:

        FOO_DATA ( from foohost);0
        FOO_DATA ( to foohost);0
        Archive ( from foohost);0
        Archive ( to foohost);0

On the Checkmk side, this results in:

[["FOO_DATA ( from foohost)", "0"],["FOO_DATA ( to foohost)", "0"],["Archive ( from foohost)", "0"],["Archive ( to foohost)", "0"]]
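
To make the mapping concrete: sep(59) in the section header means the agent lines are split at ASCII character 59, the semicolon. A minimal Python sketch reproduces the list above. The function name to_string_table is made up here, and Checkmk's own parser does more than this.

```python
def to_string_table(lines: list[str], sep_code: int = 59) -> list[list[str]]:
    """Split each non-empty agent line at the separator given by its ASCII code."""
    separator = chr(sep_code)  # chr(59) == ";"
    return [line.strip().split(separator) for line in lines if line.strip()]


agent_lines = [
    "FOO_DATA ( from foohost);0",
    "FOO_DATA ( to foohost);0",
    "Archive ( from foohost);0",
    "Archive ( to foohost);0",
]
string_table = to_string_table(agent_lines)
for row in string_table:
    print(row)
```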

The really important point is the first element of each list.
The plugin splits it at every space into a new list and then uses the individual elements.
I would say this is very prone to errors.
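
The failure can be reproduced in a few lines of Python. This is a sketch, not the plugin's actual code: the only statement known from the traceback is `direction: str = descr_raw[2]`. The way descr_raw is built here (drop the closing parenthesis, split at whitespace) is an assumption that happens to match the values in the crash report, and the function names are hypothetical.

```python
def split_description(description: str) -> list[str]:
    """Assumed tokenization: drop the trailing ')' and split at whitespace."""
    return description.rstrip(")").split()


def direction_of(description: str) -> str:
    descr_raw = split_description(description)
    return descr_raw[2]  # same index access as in the traceback


# Expected format: four tokens, the direction is the third one.
print(split_description("FOO_DATA ( from foohost)"))
print(direction_of("FOO_DATA ( from foohost)"))

# Format actually sent by the agent: only two tokens, so index 2 does not exist.
print(split_description("SHARE1 (HOST1)"))
try:
    direction_of("SHARE1 (HOST1)")
except IndexError as exc:
    print(f"IndexError: {exc}")
```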

Hi @andreas-doehler & @gstolz ,

this is the dfs_backlog section from the agent output:

<<<dfs_backlog:sep(59)>>>
SHARE1 (HOST1);0
SHARE1 (LOCALHOST);0
SHARE2 (HOST1);0
SHARE2 (LOCALHOST);0
<<<>>>

Is there a way to fix the script?

{"time": 1707234737.3121464, "os": "Red Hat Enterprise Linux release 8.9 (Ootpa)", "version": "2.2.0p20", "edition": "cce", "core": "nagios", "python_version": "3.11.5 (main, Dec  1 2023, 14:58:52) [GCC 13.2.0]", "python_paths": ["/opt/omd/versions/2.2.0p20.cce/bin", "/omd/sites/monitoring/local/lib/python3", "/omd/sites/monitoring/lib/python3/cloud", "/omd/sites/monitoring/lib/python311.zip", "/omd/sites/monitoring/lib/python3.11", "/omd/sites/monitoring/lib/python3.11/lib-dynload", 
"/omd/sites/monitoring/lib/python3.11/site-packages", "/omd/sites/monitoring/lib/python3"], "id": "b4684a18-c507-11ee-ba26-005056b8f7b9", "crash_type": "section", 
"exc_type": "IndexError", "exc_value": "list index out of range", "exc_traceback": [["/omd/sites/monitoring/lib/python3/cmk/base/agent_based/data_provider.py", 106, "_parse_raw_data", "return parse_function(list(raw_data))"], ["/omd/sites/monitoring/local/lib/python3/cmk/base/plugins/agent_based/dfs_backlog.py", 109, "parse_dfs_backlog", "return [DfsReplication.from_string_table(line) for line in string_table]"], ["/omd/sites/monitoring/local/lib/python3/cmk/base/plugins/agent_based/dfs_backlog.py", 109, "<listcomp>", "return [DfsReplication.from_string_table(line) for line in string_table]"], ["/omd/sites/monitoring/local/lib/python3/cmk/base/plugins/agent_based/dfs_backlog.py", 75, "from_string_table", "direction: str = descr_raw[2]"]], "local_vars": "eydkZXNjcl9yYXcnOiBbJ0VESScsICcoREVIQU1NUzEzNzgnXSwKICdsaW5lJzogWydFREkgKERFSEFNTVMxMzc4KScsICcwJ10sCiAnc2hhcmVfbmFtZSc6ICdFREknfQ==", "details": {"section_name": "dfs_backlog", "section_content": [["SHARE1 (HOST1)", "0"], ["SHARE1 (LOCALHOST)", "0"], ["SHARE2 (HOST1)", "0"], ["SHARE2 (LOCALHOST)", "0"]], "host_name": "MYSERVER"}}

You need to modify the PowerShell script so that it outputs the correct format.
The plugin uses the “from” and “to” to identify the direction of the connection.
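
Before redeploying the PowerShell script, the agent lines can be checked against the expected shape. The following Python sketch rests on several assumptions: the pattern is derived only from the example lines in this thread (`NAME ( from HOST);COUNT`), share and host names are assumed to contain no spaces, and NULL is accepted as a count because the original script prints it when no backlog count is available. The names LINE_PATTERN and check_line are hypothetical.

```python
import re

# Assumed shape: NAME ( from|to HOST);COUNT
LINE_PATTERN = re.compile(r"^(\S+) \( (from|to) (\S+)\);(\d+|NULL)$")


def check_line(line: str) -> bool:
    """Return True if the line has the shape the check plugin appears to expect."""
    return LINE_PATTERN.match(line.strip()) is not None


candidates = [
    "SHARE1 (HOST1);0",         # format that triggered the crash
    "SHARE1 ( from HOST1);0",   # format with direction keyword
    "SHARE1 ( to HOST1);NULL",
]
for line in candidates:
    verdict = "ok" if check_line(line) else "unexpected format"
    print(f"{line!r}: {verdict}")
```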

Thanks a lot, I didn’t know whether it was the Python script or the ps1 that was causing the problem.

I will report back.

regards

Hi,

The DFSR backlog sensor is working now.

In the ps1 PowerShell script, we had to change lines 226 to 228 from:

If ($BacklogCount -ne $Null)
{
    Write-Host -NoNewline $ReplicatedFolderName `($Smem`)";"$BacklogCount`n
}
else {
    Write-Host -NoNewline $ReplicatedFolderName `($Smem`)";NULL"`n
}

To

If ($BacklogCount -ne $Null)
{
    Write-Host -NoNewline $ReplicatedFolderName `( "from"$Smem`)";"$BacklogCount`n
    Write-Host -NoNewline $ReplicatedFolderName `( "to"$Rmem`)";"$BacklogCount`n
}
else {
    #Write-Host -NoNewline $ReplicatedFolderName `( "to"$Smem`)";NULL"`n
}

I wonder how this script worked before.

Thank you @andreas-doehler and @gstolz for your patience and your help.

regards
