Hello,
I have a strange problem in a VMware server farm.
2 of 5 ESXi nodes sporadically report an error for a short time and then return to normal.
As mentioned, the other 3 hosts run without any problems.
Only these 2 hosts show the issue.
VMware SW-Version: DELL-ESXi-700_16324942-A02
Check_MK SW-Version: 1.6.0p18 Enterprise
Log file 1 (failed run):
################################################
OMD[ISL_MONITORING]:~$ cmk --debug -vvn myesx01
[cpu_tracking] Start with phase ‘busy’
Check_MK version 1.6.0p18
Try aquire lock on /omd/sites/ISL_MONITORING/tmp/check_mk/counters/myesx01
Got lock on /omd/sites/ISL_MONITORING/tmp/check_mk/counters/myesx01
Releasing lock on /omd/sites/ISL_MONITORING/tmp/check_mk/counters/myesx01
Released lock on /omd/sites/ISL_MONITORING/tmp/check_mk/counters/myesx01
Loading autochecks from /omd/sites/ISL_MONITORING/var/check_mk/autochecks/myesx01.mk
- FETCHING DATA
[cpu_tracking] Push phase ‘ds’ (Stack: [‘busy’])
[special_vsphere] No persisted sections loaded
[special_vsphere] Not using cache (Don’t try it)
[special_vsphere] Execute data source
[special_vsphere] Calling external program “/omd/sites/ISL_MONITORING/share/check_mk/agents/special/agent_vsphere -u ‘root’ -s ‘Geheim’ -i hostsystem,virtualmachine,datastore,counters --direct --hostname ‘myesx01’ -P --spaces cut --timeout 60 --host_pwr_display esxhost --no-cert-check ‘10.180.18.81’”
[special_vsphere] ERROR: Agent exited with code 1: Error while processing received data
################################################
Log file 2 (successful run):
[cpu_tracking] Start with phase ‘busy’
Check_MK version 1.6.0p18
Try aquire lock on /omd/sites/ISL_MONITORING/tmp/check_mk/counters/myesx01
Got lock on /omd/sites/ISL_MONITORING/tmp/check_mk/counters/myesx01
Releasing lock on /omd/sites/ISL_MONITORING/tmp/check_mk/counters/myesx01
Released lock on /omd/sites/ISL_MONITORING/tmp/check_mk/counters/myesx01
Loading autochecks from /omd/sites/ISL_MONITORING/var/check_mk/autochecks/myesx01.mk
- FETCHING DATA
[cpu_tracking] Push phase ‘ds’ (Stack: [‘busy’])
[special_vsphere] No persisted sections loaded
[special_vsphere] Not using cache (Don’t try it)
[special_vsphere] Execute data source
[special_vsphere] Calling external program “/omd/sites/ISL_MONITORING/share/check_mk/agents/special/agent_vsphere -u ‘root’ -s ‘Geheim’ -i hostsystem,virtualmachine,datastore,counters --direct --hostname ‘myesx01’ -P --spaces cut --timeout 60 --host_pwr_display esxhost --no-cert-check ‘10.180.18.81’”
[special_vsphere] Write data to cache file /omd/sites/ISL_MONITORING/tmp/check_mk/cache/myesx01
Try aquire lock on /omd/sites/ISL_MONITORING/tmp/check_mk/cache/myesx01
Got lock on /omd/sites/ISL_MONITORING/tmp/check_mk/cache/myesx01
Releasing lock on /omd/sites/ISL_MONITORING/tmp/check_mk/cache/myesx01
Released lock on /omd/sites/ISL_MONITORING/tmp/check_mk/cache/myesx01
[cpu_tracking] Pop phase ‘ds’ (Stack: [‘busy’, ‘ds’])
Storing piggyback data for: hostwosvaes
Try aquire lock on /omd/sites/ISL_MONITORING/tmp/check_mk/piggyback/hostwosvaes/myesx01
Got lock on /omd/sites/ISL_MONITORING/tmp/check_mk/piggyback/hostwosvaes/myesx01
Releasing lock on /omd/sites/ISL_MONITORING/tmp/check_mk/piggyback/hostwosvaes/myesx01
Released lock on /omd/sites/ISL_MONITORING/tmp/check_mk/piggyback/hostwosvaes/myesx01
Storing piggyback data for: hostwoulocalscmk01_old
Try aquire lock on /omd/sites/ISL_MONITORING/tmp/check_mk/piggyback/hostwoulocalscmk01_old/myesx01
Got lock on /omd/sites/ISL_MONITORING/tmp/check_mk/piggyback/hostwoulocalscmk01_old/myesx01
Releasing lock on /omd/sites/ISL_MONITORING/tmp/check_mk/piggyback/hostwoulocalscmk01_old/myesx01
Released lock on /omd/sites/ISL_MONITORING/tmp/check_mk/piggyback/hostwoulocalscmk01_old/myesx01
Storing piggyback data for: hostwouscrspvpn
Try aquire lock on /omd/sites/ISL_MONITORING/tmp/check_mk/piggyback/hostwouscrspvpn/myesx01
Got lock on /omd/sites/ISL_MONITORING/tmp/check_mk/piggyback/hostwouscrspvpn/myesx01
Releasing lock on /omd/sites/ISL_MONITORING/tmp/check_mk/piggyback/hostwouscrspvpn/myesx01
Released lock on /omd/sites/ISL_MONITORING/tmp/check_mk/piggyback/hostwouscrspvpn/myesx01
Storing piggyback data for: hostwossvc01
Try aquire lock on /omd/sites/ISL_MONITORING/tmp/check_mk/piggyback/hostwossvc01/myesx01
Got lock on /omd/sites/ISL_MONITORING/tmp/check_mk/piggyback/hostwossvc01/myesx01
Releasing lock on /omd/sites/ISL_MONITORING/tmp/check_mk/piggyback/hostwossvc01/myesx01
Released lock on /omd/sites/ISL_MONITORING/tmp/check_mk/piggyback/hostwossvc01/myesx01
Storing piggyback data for: kom_localdb01
Try aquire lock on /omd/sites/ISL_MONITORING/tmp/check_mk/piggyback/kom_localdb01/myesx01
Got lock on /omd/sites/ISL_MONITORING/tmp/check_mk/piggyback/kom_localdb01/myesx01
Releasing lock on /omd/sites/ISL_MONITORING/tmp/check_mk/piggyback/kom_localdb01/myesx01
Released lock on /omd/sites/ISL_MONITORING/tmp/check_mk/piggyback/kom_localdb01/myesx01
Storing piggyback data for: hostwosddc01
Try aquire lock on /omd/sites/ISL_MONITORING/tmp/check_mk/piggyback/hostwosddc01/myesx01
Got lock on /omd/sites/ISL_MONITORING/tmp/check_mk/piggyback/hostwosddc01/myesx01
Releasing lock on /omd/sites/ISL_MONITORING/tmp/check_mk/piggyback/hostwosddc01/myesx01
Released lock on /omd/sites/ISL_MONITORING/tmp/check_mk/piggyback/hostwosddc01/myesx01
Storing piggyback data for: hostwousvir1
Try aquire lock on /omd/sites/ISL_MONITORING/tmp/check_mk/piggyback/hostwousvir1/myesx01
Got lock on /omd/sites/ISL_MONITORING/tmp/check_mk/piggyback/hostwousvir1/myesx01
Releasing lock on /omd/sites/ISL_MONITORING/tmp/check_mk/piggyback/hostwousvir1/myesx01
Released lock on /omd/sites/ISL_MONITORING/tmp/check_mk/piggyback/hostwousvir1/myesx01
Storing piggyback data for: kom_localvms02
Try aquire lock on /omd/sites/ISL_MONITORING/tmp/check_mk/piggyback/kom_localvms02/myesx01
Got lock on /omd/sites/ISL_MONITORING/tmp/check_mk/piggyback/kom_localvms02/myesx01
Releasing lock on /omd/sites/ISL_MONITORING/tmp/check_mk/piggyback/kom_localvms02/myesx01
Released lock on /omd/sites/ISL_MONITORING/tmp/check_mk/piggyback/kom_localvms02/myesx01
Storing piggyback data for: hostwosvccs
Try aquire lock on /omd/sites/ISL_MONITORING/tmp/check_mk/piggyback/hostwosvccs/myesx01
Got lock on /omd/sites/ISL_MONITORING/tmp/check_mk/piggyback/hostwosvccs/myesx01
Releasing lock on /omd/sites/ISL_MONITORING/tmp/check_mk/piggyback/hostwosvccs/myesx01
Released lock on /omd/sites/ISL_MONITORING/tmp/check_mk/piggyback/hostwosvccs/myesx01
Storing piggyback data for: hostwoustom1
Try aquire lock on /omd/sites/ISL_MONITORING/tmp/check_mk/piggyback/hostwoustom1/myesx01
Got lock on /omd/sites/ISL_MONITORING/tmp/check_mk/piggyback/hostwoustom1/myesx01
Releasing lock on /omd/sites/ISL_MONITORING/tmp/check_mk/piggyback/hostwoustom1/myesx01
Released lock on /omd/sites/ISL_MONITORING/tmp/check_mk/piggyback/hostwoustom1/myesx01
Storing piggyback data for: SIT_Remote
Try aquire lock on /omd/sites/ISL_MONITORING/tmp/check_mk/piggyback/SIT_Remote/myesx01
Got lock on /omd/sites/ISL_MONITORING/tmp/check_mk/piggyback/SIT_Remote/myesx01
Releasing lock on /omd/sites/ISL_MONITORING/tmp/check_mk/piggyback/SIT_Remote/myesx01
Released lock on /omd/sites/ISL_MONITORING/tmp/check_mk/piggyback/SIT_Remote/myesx01
CPU utilization OK - Total CPU: 33.6%, 37.41GHz/111.34GHz, 2 sockets, 18 cores/socket, 72 threads
Disk IO SUMMARY OK - Read: 323 kB/s, Write: 1.35 MB/s, Latency: 10.00 ms, Read operations: 1.01 1/s, Write operations: 7.84 1/s
Filesystem vsanDatastore OK - 4.32% used (3.77 of 87.33 TB), trend: -729.36 GB / 24 hours, Uncommitted: 781.6 GB, Provisioning: 5.19%
Hardware Sensors OK - All sensors are in normal state
HostSystem myesx01 OK - power state: poweredOn
Interface 03 OK - [vmnic10] (up) MAC: 34:80:0D:A0:2B:D6, 10 Gbit/s, In: 0 B/s (0.0%), Out: 0 B/s (0.0%)
Interface 04 OK - [vmnic11] (up) MAC: 34:80:0D:A0:2B:D7, 10 Gbit/s, In: 0 B/s (0.0%), Out: 0 B/s (0.0%)
Interface 05 OK - [vmnic12] (up) MAC: 34:80:0D:A0:2B:70, 10 Gbit/s, In: 0 B/s (0.0%), Out: 0 B/s (0.0%)
Interface 06 OK - [vmnic13] (up) MAC: 34:80:0D:A0:2B:71, 10 Gbit/s, In: 0 B/s (0.0%), Out: 0 B/s (0.0%)
Interface 15 OK - [vmnic4] (up) MAC: 34:80:0D:A0:2B:20, 10 Gbit/s, In: 28 kB/s (0.0%), Out: 69 kB/s (0.0%)
Interface 16 OK - [vmnic5] (up) MAC: 34:80:0D:A0:2B:21, 10 Gbit/s, In: 173 kB/s (0.0%), Out: 26 kB/s (0.0%)
Interface 17 OK - [vmnic6] (up) MAC: 34:80:0D:A0:2B:22, 10 Gbit/s, In: 0 B/s (0.0%), Out: 0 B/s (0.0%)
Interface 18 OK - [vmnic7] (up) MAC: 34:80:0D:A0:2B:23, 10 Gbit/s, In: 0 B/s (0.0%), Out: 0 B/s (0.0%)
Interface 19 OK - [vmnic8] (up) MAC: 34:80:0D:A0:2B:D4, 10 Gbit/s, In: 139 kB/s (0.0%), Out: 219 kB/s (0.0%)
Interface 20 OK - [vmnic9] (up) MAC: 34:80:0D:A0:2B:D5, 10 Gbit/s, In: 303 kB/s (0.0%), Out: 247 kB/s (0.0%)
Maintenance Mode OK - System not in Maintenance mode
Memory used OK - 14% used - 150.91 GB/1023.62 GB
Multipath 3963626330343362383736353030313000000000 OK - 1 active, 0 dead, 0 disabled, 0 standby, 0 unknown
Multipath 500056b34276dffd OK - 1 active, 0 dead, 0 disabled, 0 standby, 0 unknown
Multipath 5000c500d1c20e57 OK - 1 active, 0 dead, 0 disabled, 0 standby, 0 unknown
Multipath 5000c500d1c22bb7 OK - 1 active, 0 dead, 0 disabled, 0 standby, 0 unknown
Multipath 5000c500d1c244e3 OK - 1 active, 0 dead, 0 disabled, 0 standby, 0 unknown
Multipath 5000c500d1c24fa7 OK - 1 active, 0 dead, 0 disabled, 0 standby, 0 unknown
Multipath 5000c500d1c25517 OK - 1 active, 0 dead, 0 disabled, 0 standby, 0 unknown
Multipath 5000c500d1c255a7 OK - 1 active, 0 dead, 0 disabled, 0 standby, 0 unknown
Multipath 5000c500d1c26d5f OK - 1 active, 0 dead, 0 disabled, 0 standby, 0 unknown
Multipath 5000c500d1c28217 OK - 1 active, 0 dead, 0 disabled, 0 standby, 0 unknown
Multipath 5000c500d1c28643 OK - 1 active, 0 dead, 0 disabled, 0 standby, 0 unknown
Multipath 5000c500d1c28767 OK - 1 active, 0 dead, 0 disabled, 0 standby, 0 unknown
Multipath 5000c500d1c2c0ef OK - 1 active, 0 dead, 0 disabled, 0 standby, 0 unknown
Multipath 5000c500d1c2cb03 OK - 1 active, 0 dead, 0 disabled, 0 standby, 0 unknown
Multipath 5000c500d1c2cd57 OK - 1 active, 0 dead, 0 disabled, 0 standby, 0 unknown
Multipath 5000c500d1c2d02b OK - 1 active, 0 dead, 0 disabled, 0 standby, 0 unknown
Multipath 5000c500d1c30d7f OK - 1 active, 0 dead, 0 disabled, 0 standby, 0 unknown
Multipath 5000c500d1c32a07 OK - 1 active, 0 dead, 0 disabled, 0 standby, 0 unknown
Multipath 5000c500d1ebc4a7 OK - 1 active, 0 dead, 0 disabled, 0 standby, 0 unknown
Multipath 5000c500d1ebe823 OK - 1 active, 0 dead, 0 disabled, 0 standby, 0 unknown
Multipath 5000c500d1ebe84b OK - 1 active, 0 dead, 0 disabled, 0 standby, 0 unknown
Multipath 5002538bc01062e0 OK - 1 active, 0 dead, 0 disabled, 0 standby, 0 unknown
Multipath 5002538bc01062f0 OK - 1 active, 0 dead, 0 disabled, 0 standby, 0 unknown
Multipath 5002538bc0106310 OK - 1 active, 0 dead, 0 disabled, 0 standby, 0 unknown
Multipath 5002538bc010d1f0 OK - 1 active, 0 dead, 0 disabled, 0 standby, 0 unknown
Multipath vmhba2:C0:T2:L0 OK - 0 active, 0 dead, 0 disabled, 0 standby, 0 unknown
Object count OK - Virtualmachines: 11, Hostsystems: 1
Overall state OK - Entity state: green, Power state: poweredOn
System Time OK - Offset: - 80.0 ms
Uptime OK - Up since Mon Aug 31 13:27:28 2020 (169d 01:14:57)
VM SIT_Remote OK - power state: poweredOn, running on [myesx01.hostwowerksich.local]
VM kom_localdb01 OK - power state: poweredOn, running on [myesx01.hostwowerksich.local]
VM kom_localvms02 OK - power state: poweredOn, running on [myesx01.hostwowerksich.local]
VM hostwosddc01 OK - power state: poweredOn, running on [myesx01.hostwowerksich.local]
VM hostwossvc01 OK - power state: poweredOn, running on [myesx01.hostwowerksich.local]
VM hostwosvccs OK - power state: poweredOn, running on [myesx01.hostwowerksich.local]
VM hostwouscrspvpn OK - power state: poweredOn, running on [myesx01.hostwowerksich.local]
VM hostwoustom1 OK - power state: poweredOn, running on [myesx01.hostwowerksich.local]
VM hostwousvir1 OK - power state: poweredOn, running on [myesx01.hostwowerksich.local]
VM hostwoulocalscmk01_old WARN - power state: poweredOff(!), defined on [myesx01.hostwowerksich.local]
- EXECUTING INVENTORY PLUGINS
Plugins: esx_systeminfo esx_vsphere_hostsystem
[cpu_tracking] End
OK - [special_vsphere] Version: unknown, OS: unknown, execution time 0.9 sec | execution_time=0.899 user_time=0.160 system_time=0.020 children_user_time=0.130 children_system_time=0.010 cmk_time_ds=0.579
[cpu_tracking] Pop phase ‘ds’ (Stack: [‘busy’, ‘ds’])
CRIT - Agent exited with code 1: Error while processing received data
#########################################################
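Since the error only shows up sporadically, one way to catch it might be to call the agent command from the logs repeatedly and record every run that ends with a non-zero exit code. The following is only a rough sketch under that assumption: the function name `run_repeatedly` and its parameters are made up for illustration and are not part of Check_MK.

```python
import subprocess
import time
from datetime import datetime


def run_repeatedly(cmd, runs=10, delay=1.0):
    """Run cmd `runs` times; return (timestamp, exit_code, output_tail) for each failed run.

    Hypothetical helper, not part of Check_MK. `cmd` is a list of arguments,
    for example the agent_vsphere call shown in the logs above.
    """
    failures = []
    for i in range(runs):
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            # Keep only the tail of the combined output so the record stays short.
            output = (result.stdout + result.stderr).strip()
            stamp = datetime.now().isoformat(timespec="seconds")
            failures.append((stamp, result.returncode, output[-500:]))
        if i < runs - 1:
            time.sleep(delay)
    return failures


# Example call (arguments taken from the log above, adjust as needed):
# failures = run_repeatedly(
#     ["/omd/sites/ISL_MONITORING/share/check_mk/agents/special/agent_vsphere",
#      "-u", "root", "-s", "Geheim",
#      "-i", "hostsystem,virtualmachine,datastore,counters",
#      "--direct", "--hostname", "myesx01", "-P", "--spaces", "cut",
#      "--timeout", "60", "--host_pwr_display", "esxhost",
#      "--no-cert-check", "10.180.18.81"],
#     runs=50, delay=5)
# for stamp, code, tail in failures:
#     print(stamp, code, tail)
```

Comparing the timestamps of the failed runs between the two affected hosts and one of the three healthy hosts could show whether the failures correlate with anything on the ESXi side.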