Windows 11 22H2 Agent "API Error: Error running automation call diag-host"

CMK version: 2.1.0
OS version: Windows 11 (22H2)

I followed the entire tutorial at "Monitoring Windows - The new agent for Windows in detail".

✅ Updated and checked the firewall rules
✅ Registered a new host on the server via cmk-agent-ctl
✅ Established the connection, created the certificate, etc.
✅ nc -zv against port 6556 from the Checkmk server to the Windows host returns OK (showing that the port is reachable)
✅ Trust relationship status seems OK:

PS C:\Program Files (x86)\checkmk\service> .\cmk-agent-ctl.exe status
Version: 2.1.0p18
Agent socket: operational
IP allowlist: any


Connection: monoliththree:8000/monitoring
        UUID: 7d4ec82a-a353-46c9-9f9e-f756538b6e3b
        Local:
                Connection type: pull-agent
                Certificate issuer: Site 'monitoring' local CA
                Certificate validity: Wed, 25 Jan 2023 05:06:33 +0000 - Mon, 28 May 3021 05:06:33 +0000
        Remote:
                Connection type: pull-agent
                Registration state: operational
                Host name: monolithone
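
For anyone who wants to script the same reachability test as the nc check above, here is a minimal Python sketch. The function name `port_is_open` and the default timeout are my own choices, not part of Checkmk. It only shows that a TCP connection can be opened; it says nothing about TLS or the agent registration.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


# Example call (host name taken from this thread, 6556 is the agent port):
# port_is_open("monolithone", 6556)
```

A True result here matches what nc reported, so the failing connection test has to come from something other than basic port reachability.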

However, the connection test fails with the following error:

Error message:
API Error: Error running automation call diag-host: Your request timed out after 110 seconds. This issue may be related to a local configuration problem or a request which works with a too large number of objects. But if you think this issue is a bug, please send a crash report.

I have no idea what to do next. Any ideas?
Best Regards,

Rogerio Hirooka

On the monitoring server, I would run “cmk --debug -vvI hostname” as the site user to see what really happens.
The error message you see can also mean that there is a problem on the monitoring server itself.

For context, the monitoring host is named monoliththree.

I ran the debug command, with the following results:

OMD[monitoring]:/opt/omd/versions/2.1.0p18.cre/bin$ cmk --debug -vvI monoliththree
Discovering services and host labels on: monoliththree
monoliththree:
+ FETCHING DATA
  Source: SourceType.HOST/FetcherType.PROGRAM
[cpu_tracking] Start [7f4e2cb92a30]
[ProgramFetcher] Fetch with cache settings: DefaultAgentFileCache(monoliththree, base_path=/omd/sites/monitoring/tmp/check_mk/data_source_cache/special_vsphere, max_age=MaxAge(checking=0, discovery=120, inventory=120), disabled=False, use_outdated=False, simulation=False)
Not using cache (Does not exist)
[ProgramFetcher] Execute data source
Calling: /omd/sites/monitoring/share/check_mk/agents/special/agent_vsphere '-u' 'root' '-s=<redacted>' '-i' 'hostsystem,virtualmachine,datastore,counters' '--direct' '--hostname' 'monoliththree' '-P' '--spaces' 'underscore' '--no-cert-check' '127.0.1.1'
[cpu_tracking] Stop [7f4e2cb92a30 - Snapshot(process=posix.times_result(user=0.0, system=0.010000000000000009, children_user=0.22, children_system=0.02, elapsed=0.25))]
  Source: SourceType.HOST/FetcherType.PIGGYBACK
[cpu_tracking] Start [7f4e2dc6beb0]
[PiggybackFetcher] Fetch with cache settings: NoCache(monoliththree, base_path=/omd/sites/monitoring/tmp/check_mk/data_source_cache/piggyback, max_age=MaxAge(checking=0, discovery=120, inventory=120), disabled=True, use_outdated=False, simulation=False)
Not using cache (Cache usage disabled)
[PiggybackFetcher] Execute data source
No piggyback files for 'monoliththree'. Skip processing.
No piggyback files for '127.0.1.1'. Skip processing.
Not using cache (Cache usage disabled)
[cpu_tracking] Stop [7f4e2dc6beb0 - Snapshot(process=posix.times_result(user=0.0, system=0.0, children_user=0.0, children_system=0.0, elapsed=0.0))]
+ PARSE FETCHER RESULTS
  Source: SourceType.HOST/FetcherType.PROGRAM
  -> Not adding sections: Agent exited with code 1: HTTPSConnectionPool(host='127.0.1.1', port=443): Max retries exceeded with url: /sdk (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f696fcb31c0>: Failed to establish a new connection: [Errno 111] Connection refused'))
  Source: SourceType.HOST/FetcherType.PIGGYBACK
No persisted sections
  -> Add sections: []
Received no piggyback data
Received no piggyback data
+ ANALYSE DISCOVERED HOST LABELS
Trying host label discovery with: 
Trying host label discovery with: 
SUCCESS - Found no new host labels
+ ANALYSE DISCOVERED SERVICES
+ EXECUTING DISCOVERY PLUGINS (0)
  Trying discovery with: 
SUCCESS - Found no new services

Is there anything helpful?

In that command, “hostname” needs to be the name of the Windows machine you want to monitor.

Thanks!
Output is as follows:

Windows host is ‘monolithone’:

OMD[monitoring]:/opt/omd/versions/2.1.0p18.cre/bin$ cmk --debug -vvI monolithone
Discovering services and host labels on: monolithone
monolithone:
+ FETCHING DATA
  Source: SourceType.HOST/FetcherType.PROGRAM
[cpu_tracking] Start [7f9cfe118340]
[ProgramFetcher] Fetch with cache settings: DefaultAgentFileCache(monolithone, base_path=/omd/sites/monitoring/tmp/check_mk/data_source_cache/special_vsphere, max_age=MaxAge(checking=0, discovery=120, inventory=120), disabled=False, use_outdated=False, simulation=False)
Not using cache (Does not exist)
[ProgramFetcher] Execute data source
Calling: /omd/sites/monitoring/share/check_mk/agents/special/agent_vsphere '-u' 'root' '-s=<redacted>' '-i' 'hostsystem,virtualmachine,datastore,counters' '--direct' '--hostname' 'monolithone' '-P' '--spaces' 'underscore' '--no-cert-check' '192.168.1.238'
[cpu_tracking] Stop [7f9cfe118340 - Snapshot(process=posix.times_result(user=0.0, system=0.0, children_user=0.24, children_system=0.02, elapsed=131.26999999955297))]
  Source: SourceType.HOST/FetcherType.PIGGYBACK
[cpu_tracking] Start [7f9cfe118730]
[PiggybackFetcher] Fetch with cache settings: NoCache(monolithone, base_path=/omd/sites/monitoring/tmp/check_mk/data_source_cache/piggyback, max_age=MaxAge(checking=0, discovery=120, inventory=120), disabled=True, use_outdated=False, simulation=False)
Not using cache (Cache usage disabled)
[PiggybackFetcher] Execute data source
No piggyback files for 'monolithone'. Skip processing.
No piggyback files for '192.168.1.238'. Skip processing.
Not using cache (Cache usage disabled)
[cpu_tracking] Stop [7f9cfe118730 - Snapshot(process=posix.times_result(user=0.0, system=0.0, children_user=0.0, children_system=0.0, elapsed=0.0))]
+ PARSE FETCHER RESULTS
  Source: SourceType.HOST/FetcherType.PROGRAM
  -> Not adding sections: Agent exited with code 1: HTTPSConnectionPool(host='192.168.1.238', port=443): Max retries exceeded with url: /sdk (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7fedad29a1c0>: Failed to establish a new connection: [Errno 110] Connection timed out'))
  Source: SourceType.HOST/FetcherType.PIGGYBACK
No persisted sections
  -> Add sections: []
Received no piggyback data
Received no piggyback data
+ ANALYSE DISCOVERED HOST LABELS
Trying host label discovery with: 
Trying host label discovery with: 
SUCCESS - Found no new host labels
+ ANALYSE DISCOVERED SERVICES
+ EXECUTING DISCOVERY PLUGINS (0)
  Trying discovery with: 
SUCCESS - Found no new services

This host is configured as a vCenter server.
I don't think that is correct, is it?
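
The hint is in the debug output: the source is FetcherType.PROGRAM and the "Calling:" line points at agent_vsphere, where a normal agent host would be expected to show FetcherType.TCP. Below is a small Python sketch that pulls those two facts out of pasted `cmk --debug -vvI` output. The function name and the regular expressions are my own and only match the log format shown in this thread.

```python
import re

# Patterns based on the log lines shown in this thread (an assumption,
# not an official Checkmk log format specification).
FETCHER_RE = re.compile(r"Source: SourceType\.\w+/FetcherType\.(\w+)")
CALL_RE = re.compile(r"^Calling: (\S+)")


def summarize_fetchers(debug_output: str) -> dict:
    """List the fetcher types and called programs found in the output."""
    fetchers = []
    programs = []
    for raw_line in debug_output.splitlines():
        line = raw_line.strip()
        match = FETCHER_RE.search(line)
        if match and match.group(1) not in fetchers:
            fetchers.append(match.group(1))
        match = CALL_RE.match(line)
        if match:
            # Keep only the file name of the called program.
            programs.append(match.group(1).rsplit("/", 1)[-1])
    return {"fetchers": fetchers, "programs": programs}
```

If "programs" contains agent_vsphere for a host that should be queried through the regular Windows agent, a special agent rule is matching that host.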

That’s odd. I will remove it and recreate it. I do have a vSphere host, but it’s a different one and it is working OK. Let me check…

I removed and recreated the host, but the issue persists. I’m not sure why it calls agent_vsphere during diagnostics: I just added the new host, specified the IP, and went on to the connection test. On the host, I ran the agent registration, which succeeded:

RcyJSnqM3oWfSEA5aB8gfV7qaZLkswc1nmWcIvnI1dy9EUiZ1w==
-----END CERTIFICATE-----

Issued by:
        Site 'monitoring' local CA
Issued to:
        monitoring
Validity:
        From Sun, 08 Jan 2023 08:37:32 +0000
        To   Fri, 11 May 3021 08:37:32 +0000

Do you want to establish this connection? [Y/n]
> Y

C:\Program Files (x86)\checkmk\service>

But when testing on the monitoring site, it still returns the same error:

API Error: Error running automation call diag-host: Your request timed out after 110 seconds. This issue may be related to a local configuration problem or a request which works with a too large number of objects. But if you think this issue is a bug, please send a crash report.

And when I run the cmk debug command, it still defaults to agent_vsphere for some reason…

OMD[monitoring]:/opt/omd/versions/2.1.0p18.cre/bin$ cmk --debug -vvI monolithone
Discovering services and host labels on: monolithone
monolithone:
+ FETCHING DATA
  Source: SourceType.HOST/FetcherType.PROGRAM
[cpu_tracking] Start [7fdd98a405e0]
[ProgramFetcher] Fetch with cache settings: DefaultAgentFileCache(monolithone, base_path=/omd/sites/monitoring/tmp/check_mk/data_source_cache/special_vsphere, max_age=MaxAge(checking=0, discovery=120, inventory=120), disabled=False, use_outdated=False, simulation=False)
Not using cache (Does not exist)
[ProgramFetcher] Execute data source
Calling: /omd/sites/monitoring/share/check_mk/agents/special/agent_vsphere '-u' 'root' '-s=<redacted>' '-i' 'hostsystem,virtualmachine,datastore,counters' '--direct' '--hostname' 'monolithone' '-P' '--spaces' 'underscore' '--no-cert-check' '192.168.1.238'
[cpu_tracking] Stop [7fdd98a405e0 - Snapshot(process=posix.times_result(user=0.0, system=0.0, children_user=0.24, children_system=0.03, elapsed=129.48999999836087))]
  Source: SourceType.HOST/FetcherType.PIGGYBACK
[cpu_tracking] Start [7fdd98a406d0]
[PiggybackFetcher] Fetch with cache settings: NoCache(monolithone, base_path=/omd/sites/monitoring/tmp/check_mk/data_source_cache/piggyback, max_age=MaxAge(checking=0, discovery=120, inventory=120), disabled=True, use_outdated=False, simulation=False)
Not using cache (Cache usage disabled)
[PiggybackFetcher] Execute data source
No piggyback files for 'monolithone'. Skip processing.
No piggyback files for '192.168.1.238'. Skip processing.
Not using cache (Cache usage disabled)
[cpu_tracking] Stop [7fdd98a406d0 - Snapshot(process=posix.times_result(user=0.0, system=0.0, children_user=0.0, children_system=0.0, elapsed=0.0))]
+ PARSE FETCHER RESULTS
  Source: SourceType.HOST/FetcherType.PROGRAM
  -> Not adding sections: Agent exited with code 1: HTTPSConnectionPool(host='192.168.1.238', port=443): Max retries exceeded with url: /sdk (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f4d4f0611c0>: Failed to establish a new connection: [Errno 110] Connection timed out'))
  Source: SourceType.HOST/FetcherType.PIGGYBACK
No persisted sections
  -> Add sections: []
Received no piggyback data
Received no piggyback data
+ ANALYSE DISCOVERED HOST LABELS
Trying host label discovery with: 
Trying host label discovery with: 
SUCCESS - Found no new host labels
+ ANALYSE DISCOVERED SERVICES
+ EXECUTING DISCOVERY PLUGINS (0)
  Trying discovery with: 
SUCCESS - Found no new services

Yes, it was the folder rule configuration. The ESXi configuration was applied to the main folder, so it also applied to the Windows host. I reorganized the setup into one folder for the vSphere hosts and another for the Windows hosts, and all tests succeed now.
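
To illustrate why a rule on the main folder hit the Windows host, here is a toy Python model of folder inheritance. It is not Checkmk's rule evaluation; the folder names ("vsphere", "windows") and the rule name are made up for the example. The only assumption it encodes is that a rule set on a folder also applies to hosts in that folder and in every folder below it.

```python
from pathlib import PurePosixPath


def effective_rules(host_folder: str, folder_rules: dict[str, list[str]]) -> list[str]:
    """Collect rules from the host's folder and all of its parent folders."""
    host_path = PurePosixPath("/") / host_folder
    rules: list[str] = []
    for folder, names in folder_rules.items():
        folder_path = PurePosixPath("/") / folder
        # A rule applies if it sits on the host's folder or on a parent of it.
        if folder_path == host_path or folder_path in host_path.parents:
            rules.extend(names)
    return rules


# Before: the (hypothetical) vSphere rule sits on the main folder "".
before = {"": ["vsphere_special_agent"]}
# After: the rule lives only in a dedicated "vsphere" folder.
after = {"vsphere": ["vsphere_special_agent"]}
```

With `before`, a host in any folder picks up the vSphere rule; with `after`, a host in "windows" gets no special agent rule and would fall back to the normal agent.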

Thank you @andreas-doehler for the help and insights!

OMD[monitoring]:/opt/omd/versions/2.1.0p18.cre/bin$ cmk --debug -vvI monolithone
Discovering services and host labels on: monolithone
monolithone:
+ FETCHING DATA
  Source: SourceType.HOST/FetcherType.TCP
[cpu_tracking] Start [7f6bdf604580]
[TCPFetcher] Fetch with cache settings: DefaultAgentFileCache(monolithone, base_path=/omd/sites/monitoring/tmp/check_mk/cache, max_age=MaxAge(checking=0, discovery=120, inventory=120), disabled=False, use_outdated=False, simulation=False)
Using data from cache file /omd/sites/monitoring/tmp/check_mk/cache/monolithone
Got 69920 bytes data from cache
[TCPFetcher] Use cached data
[cpu_tracking] Stop [7f6bdf604580 - Snapshot(process=posix.times_result(user=0.0, system=0.0, children_user=0.0, children_system=0.0, elapsed=0.0))]
  Source: SourceType.HOST/FetcherType.PIGGYBACK
[cpu_tracking] Start [7f6bdf604a90]
[PiggybackFetcher] Fetch with cache settings: NoCache(monolithone, base_path=/omd/sites/monitoring/tmp/check_mk/data_source_cache/piggyback, max_age=MaxAge(checking=0, discovery=120, inventory=120), disabled=True, use_outdated=False, simulation=False)
Not using cache (Cache usage disabled)
[PiggybackFetcher] Execute data source
No piggyback files for 'monolithone'. Skip processing.
No piggyback files for '192.168.1.238'. Skip processing.
Not using cache (Cache usage disabled)
[cpu_tracking] Stop [7f6bdf604a90 - Snapshot(process=posix.times_result(user=0.0, system=0.0, children_user=0.0, children_system=0.0, elapsed=0.0))]
+ PARSE FETCHER RESULTS
  Source: SourceType.HOST/FetcherType.TCP
<<<check_mk>>> / Transition NOOPParser -> HostSectionParser
<<<cmk_agent_ctl_status:sep(0)>>> / Transition HostSectionParser -> HostSectionParser
<<<wmi_cpuload:sep(124)>>> / Transition HostSectionParser -> HostSectionParser
<<<uptime>>> / Transition HostSectionParser -> HostSectionParser
<<<fileinfo:sep(124)>>> / Transition HostSectionParser -> HostSectionParser
<<<mem>>> / Transition HostSectionParser -> HostSectionParser
<<<winperf_phydisk>>> / Transition HostSectionParser -> HostSectionParser
<<<winperf_if>>> / Transition HostSectionParser -> HostSectionParser
<<<winperf_processor>>> / Transition HostSectionParser -> HostSectionParser
<<<df:sep(9)>>> / Transition HostSectionParser -> HostSectionParser
<<<checkmk_agent_plugins_win:sep(0)>>> / Transition HostSectionParser -> HostSectionParser
<<<logwatch>>> / Transition HostSectionParser -> HostSectionParser
<<<services>>> / Transition HostSectionParser -> HostSectionParser
<<<ps:sep(9)>>> / Transition HostSectionParser -> HostSectionParser
<<<dotnet_clrmemory:sep(124)>>> / Transition HostSectionParser -> HostSectionParser
Transition HostSectionParser -> NOOPParser
Transition NOOPParser -> NOOPParser
<<<systemtime>>> / Transition NOOPParser -> HostSectionParser
No persisted sections
  -> Add sections: ['check_mk', 'checkmk_agent_plugins_win', 'cmk_agent_ctl_status', 'df', 'dotnet_clrmemory', 'fileinfo', 'logwatch', 'mem', 'ps', 'services', 'systemtime', 'uptime', 'winperf_if', 'winperf_phydisk', 'winperf_processor', 'wmi_cpuload']
  Source: SourceType.HOST/FetcherType.PIGGYBACK
No persisted sections
  -> Add sections: []
Received no piggyback data
Received no piggyback data
+ ANALYSE DISCOVERED HOST LABELS
Trying host label discovery with: check_mk, checkmk_agent_plugins_win, cmk_agent_ctl_status, df, dotnet_clrmemory, fileinfo, logwatch, mem, ps, services, systemtime, uptime, winperf_if, winperf_phydisk, winperf_processor, wmi_cpuload
  cmk/os_family: windows (check_mk)
Trying host label discovery with: 
Trying to acquire lock on /omd/sites/monitoring/var/check_mk/discovered_host_labels/monolithone.mk
Got lock on /omd/sites/monitoring/var/check_mk/discovered_host_labels/monolithone.mk
Releasing lock on /omd/sites/monitoring/var/check_mk/discovered_host_labels/monolithone.mk
Released lock on /omd/sites/monitoring/var/check_mk/discovered_host_labels/monolithone.mk
SUCCESS - Found no new host labels
+ ANALYSE DISCOVERED SERVICES
+ EXECUTING DISCOVERY PLUGINS (27)
  Trying discovery with: uptime, fileinfo_groups, logwatch_ec, mem_vmalloc, logwatch_groups, winperf_if, services, systemtime, mssql_datafiles, mem_win, docker_container_status_uptime, checkmk_agent, ps, fileinfo, winperf_phydisk, mssql_transactionlogs, df, wmi_cpuload, dotnet_clrmemory, mem_linux, logwatch_ec_single, services_summary, domino_tasks, esx_vsphere_hostsystem_cpu_usage, logwatch, winperf_processor_util, check_mk_only_from
SUCCESS - Found no new services
