Special_vsphere Timeout

For a few days now I have been getting constant timeouts from the special_vsphere agent on this server.
Are there any approaches for tackling this?

Before that there were no problems at all.

CMK version: 2.0.0p17
OS version: Appliance

Error message:

OMD[G302_Dresden]:~$ cmk --debug -vvn H-DRE-ESX-0001
Checkmk version 2.0.0p17
Try license usage history update.
Trying to acquire lock on /omd/sites/G302_Dresden/var/check_mk/license_usage/next_run
Got lock on /omd/sites/G302_Dresden/var/check_mk/license_usage/next_run
Trying to acquire lock on /omd/sites/G302_Dresden/var/check_mk/license_usage/history.json
Got lock on /omd/sites/G302_Dresden/var/check_mk/license_usage/history.json
Next run time has not been reached yet. Abort.
Releasing lock on /omd/sites/G302_Dresden/var/check_mk/license_usage/history.json
Released lock on /omd/sites/G302_Dresden/var/check_mk/license_usage/history.json
Releasing lock on /omd/sites/G302_Dresden/var/check_mk/license_usage/next_run
Released lock on /omd/sites/G302_Dresden/var/check_mk/license_usage/next_run
Loading autochecks from /omd/sites/G302_Dresden/var/check_mk/autochecks/H-DRE-ESX-0001.mk
+ FETCHING DATA
  Source: SourceType.MANAGEMENT/FetcherType.SNMP
[cpu_tracking] Start [7f80c14e24f0]
[SNMPFetcher] Fetch with cache settings: SNMPFileCache(base_path=PosixPath('/omd/sites/G302_Dresden/tmp/check_mk/data_source_cache/mgmt_snmp/H-DRE-ESX-0001'), max_age=MaxAge(checking=0, discovery=120, inventory=120), disabled=False, use_outdated=False, simulation=False)
Not using cache (Too old. Age is 41 sec, allowed is 0 sec)
[SNMPFetcher] Execute data source
No persisted sections loaded
  SNMP scan:
       Getting OID .1.3.6.1.2.1.1.1.0: Running 'snmpget -v2c -c public -m "" -M "" -On -OQ -Oe -Ot 10.29.10.1 .1.3.6.1.2.1.1.1.0'
SNMP answer: ==> ["Integrated Lights-Out 5 2.30 Aug 24 2020"]
b'Integrated Lights-Out 5 2.30 Aug 24 2020'
       Getting OID .1.3.6.1.2.1.1.2.0: Running 'snmpget -v2c -c public -m "" -M "" -On -OQ -Oe -Ot 10.29.10.1 .1.3.6.1.2.1.1.2.0'
SNMP answer: ==> [.1.3.6.1.4.1.232.9.4.11]
b'.1.3.6.1.4.1.232.9.4.11'
       Using cached OID .1.3.6.1.2.1.1.1.0: 'Integrated Lights-Out 5 2.30 Aug 24 2020'
       Using cached OID .1.3.6.1.2.1.1.1.0: 'Integrated Lights-Out 5 2.30 Aug 24 2020'
       Using cached OID .1.3.6.1.2.1.1.1.0: 'Integrated Lights-Out 5 2.30 Aug 24 2020'
       Using cached OID .1.3.6.1.2.1.1.1.0: 'Integrated Lights-Out 5 2.30 Aug 24 2020'
       Using cached OID .1.3.6.1.2.1.1.1.0: 'Integrated Lights-Out 5 2.30 Aug 24 2020'
       Using cached OID .1.3.6.1.2.1.1.1.0: 'Integrated Lights-Out 5 2.30 Aug 24 2020'
       Using cached OID .1.3.6.1.2.1.1.2.0: '.1.3.6.1.4.1.232.9.4.11'
       Getting OID .1.3.6.1.2.1.2.2.1.*: Running 'snmpgetnext -Cf -v2c -c public -m "" -M "" -On -OQ -Oe -Ot 10.29.10.1 .1.3.6.1.2.1.2.2.1'
SNMP answer: ==> [1]
b'1'
       Using cached OID .1.3.6.1.2.1.1.2.0: '.1.3.6.1.4.1.232.9.4.11'
       Using cached OID .1.3.6.1.2.1.1.1.0: 'Integrated Lights-Out 5 2.30 Aug 24 2020'
   SNMP scan found                    if snmp_uptime
Trying to acquire lock on /omd/sites/G302_Dresden/tmp/check_mk/snmp_scan_cache/H-DRE-ESX-0001.10.29.10.1
Got lock on /omd/sites/G302_Dresden/tmp/check_mk/snmp_scan_cache/H-DRE-ESX-0001.10.29.10.1
Releasing lock on /omd/sites/G302_Dresden/tmp/check_mk/snmp_scan_cache/H-DRE-ESX-0001.10.29.10.1
Released lock on /omd/sites/G302_Dresden/tmp/check_mk/snmp_scan_cache/H-DRE-ESX-0001.10.29.10.1
hp_proliant_cpu: Fetching data (SNMP walk cache is enabled: Use any locally cached information)
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.4.1.232.1.2.2.1.1.1'
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.4.1.232.1.2.2.1.1.2'
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.4.1.232.1.2.2.1.1.3'
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.4.1.232.1.2.2.1.1.6'
hp_proliant_da_cntlr: Fetching data (SNMP walk cache is enabled: Use any locally cached information)
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.4.1.232.3.2.2.1.1.1'
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.4.1.232.3.2.2.1.1.2'
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.4.1.232.3.2.2.1.1.5'
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.4.1.232.3.2.2.1.1.6'
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.4.1.232.3.2.2.1.1.9'
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.4.1.232.3.2.2.1.1.10'
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.4.1.232.3.2.2.1.1.12'
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.4.1.232.3.2.2.1.1.15'
hp_proliant_da_phydrv: Fetching data (SNMP walk cache is enabled: Use any locally cached information)
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.4.1.232.3.2.5.1.1.1'
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.4.1.232.3.2.5.1.1.2'
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.4.1.232.3.2.5.1.1.5'
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.4.1.232.3.2.5.1.1.6'
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.4.1.232.3.2.5.1.1.9'
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.4.1.232.3.2.5.1.1.45'
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.4.1.232.3.2.5.1.1.37'
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.4.1.232.3.2.5.1.1.50'
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.4.1.232.3.2.5.1.1.57'
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.4.1.232.3.2.5.1.1.3'
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.4.1.232.3.2.5.1.1.51'
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.4.1.232.3.2.5.1.1.60'
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.4.1.232.3.2.5.1.1.4'
hp_proliant_fans: Fetching data (SNMP walk cache is enabled: Use any locally cached information)
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.4.1.232.6.2.6.7.1.2'
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.4.1.232.6.2.6.7.1.3'
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.4.1.232.6.2.6.7.1.4'
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.4.1.232.6.2.6.7.1.6'
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.4.1.232.6.2.6.7.1.9'
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.4.1.232.6.2.6.7.1.12'
hp_proliant_mem: Fetching data (SNMP walk cache is enabled: Use any locally cached information)
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.4.1.232.6.2.14.13.1.1'
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.4.1.232.6.2.14.13.1.2'
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.4.1.232.6.2.14.13.1.3'
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.4.1.232.6.2.14.13.1.6'
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.4.1.232.6.2.14.13.1.7'
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.4.1.232.6.2.14.13.1.12'
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.4.1.232.6.2.14.13.1.19'
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.4.1.232.6.2.14.13.1.20'
hp_proliant_raid: Fetching data (SNMP walk cache is enabled: Use any locally cached information)
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.4.1.232.3.2.3.1.1.2'
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.4.1.232.3.2.3.1.1.14'
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.4.1.232.3.2.3.1.1.4'
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.4.1.232.3.2.3.1.1.9'
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.4.1.232.3.2.3.1.1.12'
hp_proliant_temp: Fetching data (SNMP walk cache is enabled: Use any locally cached information)
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.4.1.232.6.2.6.8.1.2'
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.4.1.232.6.2.6.8.1.3'
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.4.1.232.6.2.6.8.1.4'
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.4.1.232.6.2.6.8.1.5'
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.4.1.232.6.2.6.8.1.6'
if: Fetching data (SNMP walk cache is enabled: Use any locally cached information)
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.2.1.2.2.1.1'
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.2.1.2.2.1.2'
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.2.1.2.2.1.3'
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.2.1.2.2.1.5'
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.2.1.2.2.1.8'
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.2.1.2.2.1.10'
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.2.1.2.2.1.11'
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.2.1.2.2.1.12'
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.2.1.2.2.1.13'
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.2.1.2.2.1.14'
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.2.1.2.2.1.16'
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.2.1.2.2.1.17'
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.2.1.2.2.1.18'
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.2.1.2.2.1.19'
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.2.1.2.2.1.20'
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.2.1.2.2.1.21'
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.2.1.2.2.1.6'
snmp_info: Fetching data (SNMP walk cache is enabled: Use any locally cached information)
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.2.1.1.1'
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.2.1.1.4'
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.2.1.1.5'
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.2.1.1.6'
snmp_uptime: Fetching data (SNMP walk cache is enabled: Use any locally cached information)
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.2.1.1.3'
Running 'snmpbulkwalk -Cr10 -v2c -c public -m "" -M "" -Cc -OQ -OU -On -Ot 10.29.10.1 .1.3.6.1.2.1.25.1.1'
Write data to cache file /omd/sites/G302_Dresden/tmp/check_mk/data_source_cache/mgmt_snmp/checking/H-DRE-ESX-0001
Trying to acquire lock on /omd/sites/G302_Dresden/tmp/check_mk/data_source_cache/mgmt_snmp/checking/H-DRE-ESX-0001
Got lock on /omd/sites/G302_Dresden/tmp/check_mk/data_source_cache/mgmt_snmp/checking/H-DRE-ESX-0001
Releasing lock on /omd/sites/G302_Dresden/tmp/check_mk/data_source_cache/mgmt_snmp/checking/H-DRE-ESX-0001
Released lock on /omd/sites/G302_Dresden/tmp/check_mk/data_source_cache/mgmt_snmp/checking/H-DRE-ESX-0001
[cpu_tracking] Stop [7f80c14e24f0 - Snapshot(process=posix.times_result(user=0.040000000000000036, system=0.09000000000000001, children_user=0.0, children_system=0.0, elapsed=1.7400000020861626))]
  Source: SourceType.HOST/FetcherType.PROGRAM
[cpu_tracking] Start [7f80c14e2100]
Calling: /omd/sites/G302_Dresden/share/check_mk/agents/special/agent_vsphere --pwstore=3@3@ESX_Standortserver '-u' 'root' '-s=*************' '-i' 'datastore,counters' '--direct' '--hostname' 'H-DRE-ESX-0001' '-P' '--spaces' 'underscore' '--vm_piggyname' 'alias' '--host_pwr_display' 'esxhost' '--snapshots-on-host' '--no-cert-check' '10.29.16.1'
[ProgramFetcher] Fetch with cache settings: DefaultAgentFileCache(base_path=PosixPath('/omd/sites/G302_Dresden/tmp/check_mk/data_source_cache/special_vsphere/H-DRE-ESX-0001'), max_age=MaxAge(checking=0, discovery=120, inventory=120), disabled=False, use_outdated=False, simulation=False)
Not using cache (Too old. Age is 840 sec, allowed is 0 sec)
[ProgramFetcher] Execute data source
Write data to cache file /omd/sites/G302_Dresden/tmp/check_mk/data_source_cache/special_vsphere/H-DRE-ESX-0001
Trying to acquire lock on /omd/sites/G302_Dresden/tmp/check_mk/data_source_cache/special_vsphere/H-DRE-ESX-0001
Got lock on /omd/sites/G302_Dresden/tmp/check_mk/data_source_cache/special_vsphere/H-DRE-ESX-0001
Releasing lock on /omd/sites/G302_Dresden/tmp/check_mk/data_source_cache/special_vsphere/H-DRE-ESX-0001
Released lock on /omd/sites/G302_Dresden/tmp/check_mk/data_source_cache/special_vsphere/H-DRE-ESX-0001
[cpu_tracking] Stop [7f80c14e2100 - Snapshot(process=posix.times_result(user=0.010000000000000009, system=0.0, children_user=165.56, children_system=0.06, elapsed=167.10999999940395))]
  Source: SourceType.HOST/FetcherType.PIGGYBACK
[cpu_tracking] Start [7f80c1343070]
No piggyback files for 'H-DRE-ESX-0001'. Skip processing.
No piggyback files for '10.29.16.1'. Skip processing.
[PiggybackFetcher] Fetch with cache settings: NoCache(base_path=PosixPath('/omd/sites/G302_Dresden/tmp/check_mk/data_source_cache/piggyback/H-DRE-ESX-0001'), max_age=MaxAge(checking=0, discovery=120, inventory=120), disabled=False, use_outdated=False, simulation=False)
[PiggybackFetcher] Execute data source
[cpu_tracking] Stop [7f80c1343070 - Snapshot(process=posix.times_result(user=0.0, system=0.0, children_user=0.0, children_system=0.0, elapsed=0.0))]
[cpu_tracking] Start [7f80c14df940]
+ PARSE FETCHER RESULTS
  Source: SourceType.MANAGEMENT/FetcherType.SNMP
No persisted sections loaded
  -> Add sections: ['hp_proliant_cpu', 'hp_proliant_da_cntlr', 'hp_proliant_da_phydrv', 'hp_proliant_fans', 'hp_proliant_mem', 'hp_proliant_raid', 'hp_proliant_temp', 'if', 'snmp_info', 'snmp_uptime']
  Source: SourceType.HOST/FetcherType.PROGRAM
No persisted sections loaded
  -> Add sections: ['esx_systeminfo', 'esx_vsphere_counters', 'esx_vsphere_datastores', 'systemtime']
  Source: SourceType.HOST/FetcherType.PIGGYBACK
No persisted sections loaded
  -> Add sections: []
Received no piggyback data
Loading item states
Trying to acquire lock on /omd/sites/G302_Dresden/tmp/check_mk/counters/H-DRE-ESX-0001
Got lock on /omd/sites/G302_Dresden/tmp/check_mk/counters/H-DRE-ESX-0001
Releasing lock on /omd/sites/G302_Dresden/tmp/check_mk/counters/H-DRE-ESX-0001
Released lock on /omd/sites/G302_Dresden/tmp/check_mk/counters/H-DRE-ESX-0001
Datastore IO SUMMARY PEND - Counter data is missing
Disk IO SUMMARY      Item not found in monitoring data
Filesystem RAID1_SSD 63.92% used (595.23 of 931.25 GB), trend: 0.00 B / 24 hours, Uncommitted: 23.58 GB, Provisioning: 66.45%
Interface 1          PEND - Counter data is missing
Management Interface: HW CPU 0 CPU0 "Intel(R) Xeon(R) E-2224 CPU @ 3.40GHz" in slot 0 is in state "ok"
Management Interface: HW Controller 1 Condition: ok, Board-Condition: ok, Board-Status: ok, (Role: other, Model: 91, Slot: 1, Serial: PEYHL0ARCC002F)
Management Interface: HW FAN1 (system) FAN Sensor 1 "system", Speed is normal, State is ok
Management Interface: HW Mem 0 Board: 0, Number: 0, Type: unknown (19), Size: 16.0 GiB, Status: good, Condition: ok
Management Interface: HW Mem 1 Board: 0, Number: 1, Type: unknown (19), Size: 16.0 GiB, Status: good, Condition: ok
Management Interface: HW Phydrv 1/0 Bay: 1, Bus number: 255, Status: ok, Smart status: ok, Ref hours: 7464, Size: 953869MB, Condition: ok
Management Interface: HW Phydrv 1/1 Bay: 2, Bus number: 255, Status: ok, Smart status: ok, Ref hours: 7464, Size: 953869MB, Condition: ok
Management Interface: Interface 1 [I350 Gigabit Network Connection], (up), MAC: B4:7A:F1:39:D8:CE, Speed: 1 GBit/s, In: 0.00 B/s (0%), Out: 0.00 B/s (0%)
Management Interface: Logical Device Status: OK, Logical volume size: 931.48 GB
Management Interface: SNMP Info Integrated Lights-Out 5 2.30 Aug 24 2020, ILO-H-DRE-ESX-0001.service.joba-group.local, unknown,
Management Interface: Temperature 1 (ambient) 28.0 °C
Management Interface: Temperature 10 (ioBoard) 39.0 °C
Management Interface: Temperature 11 (system) 47.0 °C
Management Interface: Temperature 2 (cpu) 40.0 °C
Management Interface: Temperature 3 (memory) 36.0 °C
Management Interface: Temperature 4 (system) 35.0 °C
Management Interface: Temperature 5 (system) 53.0 °C
Management Interface: Temperature 6 (system) 43.0 °C
Management Interface: Temperature 7 (system) 78.0 °C
Management Interface: Temperature 8 (system) 53.0 °C
Management Interface: Temperature 9 (ioBoard) 65.0 °C
Management Interface: Uptime Up since Sep 19 2021 23:45:38, Uptime: 88 days 13 hours
System Time          Offset: -14.7 ms
Uptime               PEND - Counter data is missing
VMKernel Swap        Swap in: not available, Swap out: not available, Swap used: not available
No piggyback files for 'H-DRE-ESX-0001'. Skip processing.
No piggyback files for '10.29.16.1'. Skip processing.
[cpu_tracking] Stop [7f80c14df940 - Snapshot(process=posix.times_result(user=0.010000000000000009, system=0.0, children_user=0.0, children_system=0.0, elapsed=0.010000001639127731))]
[mgmt_snmp] Success, [special_vsphere] Version: unknown, OS: unknown, execution time 168.9 sec | execution_time=168.860 user_time=0.060 system_time=0.090 children_user_time=165.560 children_system_time=0.060 cmk_time_snmp=1.610 cmk_time_ds=1.480 cmk_time_agent=0.000

Try deactivating the management board in Checkmk. :slight_smile:

Hey, I'll definitely give that a try. The management board has actually been active for quite a while, though :(

Our recommendation: create a separate host for each management board, named according to a suitable naming scheme. That has several advantages. :slight_smile:

Hi @Flolo,

search the forum for Management Board; you'll find many posts discussing the limitations and drawbacks of configuring the management board directly on the host.

I'll keep an eye on this for the one host first, and then change it on all the other hosts as well, the way I've now read it.

Unfortunately I'm getting timeouts after 120 seconds again…

OMD[G302_Dresden]:~$ cmk --debug -vvn H-DRE-ESX-0001
Checkmk version 2.0.0p17
Try license usage history update.
Trying to acquire lock on /omd/sites/G302_Dresden/var/check_mk/license_usage/next_run
Got lock on /omd/sites/G302_Dresden/var/check_mk/license_usage/next_run
Trying to acquire lock on /omd/sites/G302_Dresden/var/check_mk/license_usage/history.json
Got lock on /omd/sites/G302_Dresden/var/check_mk/license_usage/history.json
Next run time has not been reached yet. Abort.
Releasing lock on /omd/sites/G302_Dresden/var/check_mk/license_usage/history.json
Released lock on /omd/sites/G302_Dresden/var/check_mk/license_usage/history.json
Releasing lock on /omd/sites/G302_Dresden/var/check_mk/license_usage/next_run
Released lock on /omd/sites/G302_Dresden/var/check_mk/license_usage/next_run
Loading autochecks from /omd/sites/G302_Dresden/var/check_mk/autochecks/H-DRE-ESX-0001.mk
+ FETCHING DATA
  Source: SourceType.HOST/FetcherType.PROGRAM
[cpu_tracking] Start [7f1441ecf0d0]
Calling: /omd/sites/G302_Dresden/share/check_mk/agents/special/agent_vsphere --pwstore=3@3@ESX_Standortserver '-u' 'root' '-s=*************' '-i' 'datastore,counters' '--direct' '--hostname' 'H-DRE-ESX-0001' '-P' '--spaces' 'underscore' '--vm_piggyname' 'alias' '--host_pwr_display' 'esxhost' '--snapshots-on-host' '--no-cert-check' '10.29.16.1'
[ProgramFetcher] Fetch with cache settings: DefaultAgentFileCache(base_path=PosixPath('/omd/sites/G302_Dresden/tmp/check_mk/data_source_cache/special_vsphere/H-DRE-ESX-0001'), max_age=MaxAge(checking=0, discovery=120, inventory=120), disabled=False, use_outdated=False, simulation=False)
Not using cache (Too old. Age is 333 sec, allowed is 0 sec)
[ProgramFetcher] Execute data source
Write data to cache file /omd/sites/G302_Dresden/tmp/check_mk/data_source_cache/special_vsphere/H-DRE-ESX-0001
Trying to acquire lock on /omd/sites/G302_Dresden/tmp/check_mk/data_source_cache/special_vsphere/H-DRE-ESX-0001
Got lock on /omd/sites/G302_Dresden/tmp/check_mk/data_source_cache/special_vsphere/H-DRE-ESX-0001
Releasing lock on /omd/sites/G302_Dresden/tmp/check_mk/data_source_cache/special_vsphere/H-DRE-ESX-0001
Released lock on /omd/sites/G302_Dresden/tmp/check_mk/data_source_cache/special_vsphere/H-DRE-ESX-0001
[cpu_tracking] Stop [7f1441ecf0d0 - Snapshot(process=posix.times_result(user=0.0, system=0.0, children_user=165.6, children_system=0.1, elapsed=167.14000000059605))]
  Source: SourceType.HOST/FetcherType.PIGGYBACK
[cpu_tracking] Start [7f1441ebdac0]
No piggyback files for 'H-DRE-ESX-0001'. Skip processing.
No piggyback files for '10.29.16.1'. Skip processing.
[PiggybackFetcher] Fetch with cache settings: NoCache(base_path=PosixPath('/omd/sites/G302_Dresden/tmp/check_mk/data_source_cache/piggyback/H-DRE-ESX-0001'), max_age=MaxAge(checking=0, discovery=120, inventory=120), disabled=False, use_outdated=False, simulation=False)
[PiggybackFetcher] Execute data source
[cpu_tracking] Stop [7f1441ebdac0 - Snapshot(process=posix.times_result(user=0.0, system=0.0, children_user=0.0, children_system=0.0, elapsed=0.0))]
[cpu_tracking] Start [7f14436417f0]
+ PARSE FETCHER RESULTS
  Source: SourceType.HOST/FetcherType.PROGRAM
No persisted sections loaded
  -> Add sections: ['esx_systeminfo', 'esx_vsphere_counters', 'esx_vsphere_datastores', 'systemtime']
  Source: SourceType.HOST/FetcherType.PIGGYBACK
No persisted sections loaded
  -> Add sections: []
Received no piggyback data
Loading item states
Trying to acquire lock on /omd/sites/G302_Dresden/tmp/check_mk/counters/H-DRE-ESX-0001
Got lock on /omd/sites/G302_Dresden/tmp/check_mk/counters/H-DRE-ESX-0001
Releasing lock on /omd/sites/G302_Dresden/tmp/check_mk/counters/H-DRE-ESX-0001
Released lock on /omd/sites/G302_Dresden/tmp/check_mk/counters/H-DRE-ESX-0001
Datastore IO SUMMARY PEND - Counter data is missing
Disk IO SUMMARY      Item not found in monitoring data
Filesystem RAID1_SSD 63.92% used (595.23 of 931.25 GB), trend: 0.00 B / 24 hours, Uncommitted: 23.58 GB, Provisioning: 66.45%
Interface 1          PEND - Counter data is missing
System Time          Offset: -4.70 ms
Uptime               PEND - Counter data is missing
VMKernel Swap        Swap in: not available, Swap out: not available, Swap used: not available
No piggyback files for 'H-DRE-ESX-0001'. Skip processing.
No piggyback files for '10.29.16.1'. Skip processing.
[cpu_tracking] Stop [7f14436417f0 - Snapshot(process=posix.times_result(user=0.010000000000000009, system=0.0, children_user=0.0, children_system=0.0, elapsed=0.019999999552965164))]
[special_vsphere] Version: unknown, OS: unknown, execution time 167.2 sec | execution_time=167.160 user_time=0.010 system_time=0.000 children_user_time=165.600 children_system_time=0.100 cmk_time_ds=1.440 cmk_time_agent=0.000
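
The summary line at the end of the run carries the timings as perfdata. A minimal Python sketch, assuming the `name=value` layout shown above, that splits the values apart and compares the total against the 120 second timeout mentioned in the post. `parse_perfdata` and `TIMEOUT_S` are names made up for this illustration, not part of Checkmk:

```python
# Perfdata copied from the summary line of the run above.
PERFDATA = (
    "execution_time=167.160 user_time=0.010 system_time=0.000 "
    "children_user_time=165.600 children_system_time=0.100 "
    "cmk_time_ds=1.440 cmk_time_agent=0.000"
)

TIMEOUT_S = 120.0  # assumption: the timeout value mentioned in the post


def parse_perfdata(text):
    """Split whitespace-separated 'name=value' pairs into a dict of floats."""
    pairs = (item.split("=", 1) for item in text.split())
    return {name: float(value) for name, value in pairs}


timings = parse_perfdata(PERFDATA)

# How far the whole check run overshoots the timeout.
over = timings["execution_time"] - TIMEOUT_S
print(f"execution_time exceeds the timeout by {over:.1f} s")

# Time accounted to child processes, i.e. the special agent call.
child = timings["children_user_time"] + timings["children_system_time"]
print(f"time accounted to child processes: {child:.1f} s")
```

Nearly all of the runtime is accounted to the child process, so the delay sits in the special agent call rather than in Checkmk's own processing.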

What about the vCenter, were any changes made to it? Or to components on the path to it?

This is a plain ESXi host. It is monitored by a local Checkmk instance that runs directly on the host.

There were no changes there.

If it really has a problem here, I would manually run the command that is used as the special agent for this host.

Just do a “cmk -D hostname” and then execute the special agent call by hand.
Maybe you'll see where it gets stuck. It is also possible to have the special agent call write a trace file, which shows exactly where the problems are.
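
To see how long a manual call really takes, and to stop waiting after a fixed limit, the call can be wrapped in a small script. This is only a sketch under stated assumptions: `run_timed` is a hypothetical helper, and the demo command is a placeholder that just sleeps briefly. For a real test, substitute the agent command line printed after `Calling:` in the debug output.

```python
import subprocess
import sys
import time


def run_timed(cmd, limit_s):
    """Run cmd and return (finished, elapsed_seconds, stdout).

    finished is False if the command was killed after limit_s seconds.
    """
    start = time.monotonic()
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=limit_s)
        return True, time.monotonic() - start, proc.stdout
    except subprocess.TimeoutExpired:
        return False, time.monotonic() - start, ""


# Placeholder command (assumption): stands in for the special agent call.
demo_cmd = [sys.executable, "-c", "import time; time.sleep(0.2); print('<<<demo>>>')"]

finished, elapsed, out = run_timed(demo_cmd, limit_s=5.0)
print(f"finished={finished} elapsed={elapsed:.2f}s first line={out.splitlines()[:1]}")
```

Running this a few times in a row shows whether the delay is constant or only occasional, which is the kind of detail the trace file can then narrow down.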

Yesterday I also restarted the ESXi management agents on the host. Since then the queries have been working flawlessly again…
Today, in exchange, the next host at a different site is blessed with permanent timeouts :sleepy:

When I try to call the special agent on the console, nothing at all happens when it runs into the timeout. When it does work, it shows the ESXi host's performance data as normal.

Then please add the trace file options to the special agent call. In the trace file you will probably see that it already gets stuck at login. At least that is what I ran into once. As soon as the login has succeeded, the rest has to run through without problems.

        \ xsi:type=\"PerfMetricIntSeries\"><id><counterId>196625</counterId><instance>vmnic0</instance></id><value>0</value></value><value\
        \ xsi:type=\"PerfMetricIntSeries\"><id><counterId>196625</counterId><instance>vmnic2</instance></id><value>0</value></value><value\
        \ xsi:type=\"PerfMetricIntSeries\"><id><counterId>196625</counterId><instance>vmnic1</instance></id><value>0</value></value><value\
        \ xsi:type=\"PerfMetricIntSeries\"><id><counterId>196625</counterId><instance>vmnic3</instance></id><value>0</value></value><value\
        \ xsi:type=\"PerfMetricIntSeries\"><id><counterId>196625</counterId><instance>vusb0</instance></id><value>0</value></value><value\
        \ xsi:type=\"PerfMetricIntSeries\"><id><counterId>196625</counterId><instance></instance></id><value>0</value></value><value\
        \ xsi:type=\"PerfMetricIntSeries\"><id><counterId>196618</counterId><instance>vmnic0</instance></id><value>3</value></value><value\
        \ xsi:type=\"PerfMetricIntSeries\"><id><counterId>196618</counterId><instance>vusb0</instance></id><value>0</value></value><value\
        \ xsi:type=\"PerfMetricIntSeries\"><id><counterId>196618</counterId><instance></instance></id><value>3</value></value><value\
        \ xsi:type=\"PerfMetricIntSeries\"><id><counterId>196618</counterId><instance>vmnic1</instance></id><value>0</value></value><value\
        \ xsi:type=\"PerfMetricIntSeries\"><id><counterId>196618</counterId><instance>vmnic3</instance></id><value>0</value></value><value\
        \ xsi:type=\"PerfMetricIntSeries\"><id><counterId>196618</counterId><instance>vmnic2</instance></id><value>0</value></value><value\
        \ xsi:type=\"PerfMetricIntSeries\"><id><counterId>655362</counterId><instance>5f27c39a-9ff23458-cc5d-b47af1338cb8</instance></id><value>39</value></value><value\
        \ xsi:type=\"PerfMetricIntSeries\"><id><counterId>655364</counterId><instance>5f27c39a-9ff23458-cc5d-b47af1338cb8</instance></id><value>0</value></value><value\
        \ xsi:type=\"PerfMetricIntSeries\"><id><counterId>655371</counterId><instance>5f27c39a-9ff23458-cc5d-b47af1338cb8</instance></id><value>0</value></value></returnval></QueryPerfResponse>\n\
        </soapenv:Body>\n</soapenv:Envelope>"
    headers:
      Cache-Control:
      - no-cache
      Connection:
      - Keep-Alive
      Content-Type:
      - text/xml; charset=utf-8
      Date:
      - Tue, 21 Dec 2021 08:22:11 GMT
      Transfer-Encoding:
      - chunked
      X-Frame-Options:
      - DENY
    status:
      code: 200
      message: OK
- request:
    body: <SOAP-ENV:Envelope xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/"
      xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" xmlns:ZSI="http://www.zolera.com/schemas/ZSI/"
      xmlns:soapenc="http://schemas.xmlsoap.org/soap/encoding/" xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
      xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><SOAP-ENV:Header></SOAP-ENV:Header><SOAP-ENV:Body
      xmlns:ns1="urn:vim25"><ns1:CurrentTime xsi:type="ns1:CurrentTimeRequestType">  <ns1:_this
      type="ServiceInstance">ServiceInstance</ns1:_this></ns1:CurrentTime></SOAP-ENV:Body></SOAP-ENV:Envelope>
    headers:
      Accept:
      - '*/*'
      Accept-Encoding:
      - gzip, deflate
      Connection:
      - keep-alive
      Content-Length:
      - '643'
      Content-Type:
      - text/xml; charset="utf-8"
      Cookie:
      - 'vmware_soap_session="a2055e6c86ab55426744a7b9742cde6cc5ce1136"; Path=/; HttpOnly;
        Secure; '
      SOAPAction:
      - urn:vim25/5.0
      User-Agent:
      - Checkmk special agent vsphere
    method: POST
    uri: https://10.39.16.1/sdk
  response:
    body:
      string: "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<soapenv:Envelope xmlns:soapenc=\"\
        http://schemas.xmlsoap.org/soap/encoding/\"\n xmlns:soapenv=\"http://schemas.xmlsoap.org/soap/envelope/\"\
        \n xmlns:xsd=\"http://www.w3.org/2001/XMLSchema\"\n xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\"\
        >\n<soapenv:Body>\n<CurrentTimeResponse xmlns=\"urn:vim25\"><returnval>2021-12-21T08:25:47.253441Z</returnval></CurrentTimeResponse>\n\
        </soapenv:Body>\n</soapenv:Envelope>"
    headers:
      Cache-Control:
      - no-cache
      Connection:
      - Keep-Alive
      Content-Length:
      - '438'
      Content-Type:
      - text/xml; charset=utf-8
      Date:
      - Tue, 21 Dec 2021 08:25:47 GMT
      X-Frame-Options:
      - DENY
    status:
      code: 200
      message: OK

With the request I attached above, it then seems to wait forever for a response. That response only arrives 3 minutes later.
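
The delay can be read from the `Date` headers of the two responses in the trace excerpt. A small Python sketch of that arithmetic; the variable names are made up for illustration, and the header values are copied from the trace:

```python
from email.utils import parsedate_to_datetime

# Date headers copied from the two responses in the trace above.
previous_response = "Tue, 21 Dec 2021 08:22:11 GMT"      # QueryPerf response
current_time_response = "Tue, 21 Dec 2021 08:25:47 GMT"  # CurrentTime response

gap = parsedate_to_datetime(current_time_response) - parsedate_to_datetime(previous_response)
print(f"gap between the two responses: {gap.total_seconds():.0f} s")  # 216 s
```

That is 3 minutes 36 seconds between the two server responses. The headers only show when each response was sent, so the gap by itself does not say which side spent the time.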

Hello @Flolo,

I know from earlier that the counters (performance values) of ESXi/vSphere can affect the query very noticeably. You could cross-check whether you still have these problems when you disable the counters entirely. I think there was also a hint about this in some guide.
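
For a quick manual cross-check, the agent call from the debug output can be repeated without `counters` in the value that follows `-i`. A sketch, assuming the argument layout from the `Calling:` line; `drop_info_type` is a hypothetical helper, and the permanent setting belongs in the rule that configures the special agent, not in a script:

```python
def drop_info_type(argv, unwanted):
    """Return a copy of argv with `unwanted` removed from the value after '-i'."""
    result = list(argv)
    for pos, arg in enumerate(result[:-1]):
        if arg == "-i":
            kept = [item for item in result[pos + 1].split(",") if item != unwanted]
            result[pos + 1] = ",".join(kept)
    return result


# Shortened argument list based on the 'Calling:' line in the debug output.
argv = ["agent_vsphere", "-u", "root", "-i", "datastore,counters", "--direct", "10.29.16.1"]
print(drop_info_type(argv, "counters"))
```

If the shortened call returns quickly while the full call does not, that points at the performance counter query as the slow part.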

In the end the error is “quite trivial”. The special agent hangs when querying the system time of the ESX server.

    SYSTEMTIME = (
        '<ns1:CurrentTime xsi:type="ns1:CurrentTimeRequestType">'
        '  <ns1:_this type="ServiceInstance">ServiceInstance</ns1:_this>'
        '</ns1:CurrentTime>'
    )

The query is executed and it hangs there. The server has no valid NTP config. Please correct that and try again.
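
When the call does return, the answer is a single timestamp. A minimal Python sketch of extracting it, using the response body from the trace above with whitespace and unused namespaces condensed. This is an illustration only and not how the special agent itself parses the reply:

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Response body as seen in the trace above (condensed for readability).
RESPONSE = (
    '<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">'
    "<soapenv:Body>"
    '<CurrentTimeResponse xmlns="urn:vim25">'
    "<returnval>2021-12-21T08:25:47.253441Z</returnval>"
    "</CurrentTimeResponse>"
    "</soapenv:Body>"
    "</soapenv:Envelope>"
)

root = ET.fromstring(RESPONSE)

# The element lives in the urn:vim25 default namespace.
returnval = root.find(".//{urn:vim25}returnval").text

# The trailing 'Z' marks UTC; attach the timezone explicitly.
server_time = datetime.strptime(returnval, "%Y-%m-%dT%H:%M:%S.%fZ").replace(tzinfo=timezone.utc)
print(server_time.isoformat())
```

The timestamp matches the `Date` header of the same response to the second, so the server's clock is at least consistent with the time at which it finally answered.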

The CMK instance and the ESX use the identical time server, and both report a successful NTP synchronization…

Since disabling the performance data I haven't seen any more dropouts.
So now I'll have to wait and see whether it stays that way.

I would guess at a problem on the ESXi host here; the behavior you describe supports that. If no response comes back, the agent can't do anything either.
But it's good that you've found a workaround that at least puts you in a position to monitor.

So far I haven't noticed any further timeouts. I'll leave the performance data disabled for now.

Why it has problems querying the time, though, I unfortunately can't quite explain…

Glad it works this way. I still suspect a problem on the ESXi side, though. Maybe you'll still find out where the problem lies. :slight_smile:
