GUI stopped showing check results

CMK version: 2.1.0p18 CRE
OS version: Debian 11

Output of cmk -nv --detect-plugins=local hostname

OMD[bk]:~$ cmk -nv --detect-plugins=local wmsserver
* FETCHING DATA
[TCPFetcher] Execute data source
[PiggybackFetcher] Execute data source
No piggyback files for 'wmsserver'. Skip processing.
No piggyback files for '192.168.200.90'. Skip processing.
[...]
Plenty_Info: Status 11.02 [6583,6594] This is a local Check, Count: 2.00
Plenty_Info: Status 11.03 [], Count: 0.00
[...]
[agent] Success, execution time 93.0 sec | execution_time=93.020 user_time=0.010 system_time=0.000 children_user_time=0.000 children_system_time=0.000 cmk_time_agent=93.000

Output of cmk --debug -vvn hostname

OMD[bk]:~$ cmk --debug -vvn wmsserver
Checkmk version 2.1.0p18
Try license usage history update.
Trying to acquire lock on /omd/sites/bk/var/check_mk/license_usage/next_run
Got lock on /omd/sites/bk/var/check_mk/license_usage/next_run
Trying to acquire lock on /omd/sites/bk/var/check_mk/license_usage/history.json
Got lock on /omd/sites/bk/var/check_mk/license_usage/history.json
Next run time has not been reached yet. Abort.
Releasing lock on /omd/sites/bk/var/check_mk/license_usage/history.json
Released lock on /omd/sites/bk/var/check_mk/license_usage/history.json
Releasing lock on /omd/sites/bk/var/check_mk/license_usage/next_run
Released lock on /omd/sites/bk/var/check_mk/license_usage/next_run
+ FETCHING DATA
  Source: SourceType.HOST/FetcherType.TCP
[cpu_tracking] Start [7f96da8d4520]
[TCPFetcher] Fetch with cache settings: DefaultAgentFileCache(wmsserver, base_path=/omd/sites/bk/tmp/check_mk/cache, max_age=MaxAge(checking=0, discovery=120, inventory=120), disabled=False, use_outdated=False, simulation=False)
Not using cache (Does not exist)
[TCPFetcher] Execute data source
Connecting via TCP to 192.168.200.90:6556 (5.0s timeout)
Detected transport protocol: TransportProtocol.PLAIN (b'<<')
Reading data from agent
Write data to cache file /omd/sites/bk/tmp/check_mk/cache/wmsserver
Trying to acquire lock on /omd/sites/bk/tmp/check_mk/cache/wmsserver
Got lock on /omd/sites/bk/tmp/check_mk/cache/wmsserver
Releasing lock on /omd/sites/bk/tmp/check_mk/cache/wmsserver
Released lock on /omd/sites/bk/tmp/check_mk/cache/wmsserver
Closing TCP connection to 192.168.200.90:6556
[cpu_tracking] Stop [7f96da8d4520 - Snapshot(process=posix.times_result(user=0.0, system=0.009999999999999981, children_user=0.0, children_system=0.0, elapsed=87.09999999776483))]
  Source: SourceType.HOST/FetcherType.PIGGYBACK
[cpu_tracking] Start [7f96da8cd4c0]
[PiggybackFetcher] Fetch with cache settings: NoCache(wmsserver, base_path=/omd/sites/bk/tmp/check_mk/data_source_cache/piggyback, max_age=MaxAge(checking=0, discovery=120, inventory=120), disabled=True, use_outdated=False, simulation=False)
Not using cache (Cache usage disabled)
[PiggybackFetcher] Execute data source
No piggyback files for 'wmsserver'. Skip processing.
No piggyback files for '192.168.200.90'. Skip processing.
Not using cache (Cache usage disabled)
[cpu_tracking] Stop [7f96da8cd4c0 - Snapshot(process=posix.times_result(user=0.0, system=0.0, children_user=0.0, children_system=0.0, elapsed=0.0))]
+ PARSE FETCHER RESULTS
  Source: SourceType.HOST/FetcherType.TCP
<<<check_mk>>> / Transition NOOPParser -> HostSectionParser
<<<labels:sep(0)>>> / Transition HostSectionParser -> HostSectionParser
<<<df>>> / Transition HostSectionParser -> HostSectionParser
<<<df>>> / Transition HostSectionParser -> HostSectionParser
<<<systemd_units>>> / Transition HostSectionParser -> HostSectionParser
<<<nfsmounts>>> / Transition HostSectionParser -> HostSectionParser
<<<cifsmounts>>> / Transition HostSectionParser -> HostSectionParser
<<<mounts>>> / Transition HostSectionParser -> HostSectionParser
<<<ps_lnx>>> / Transition HostSectionParser -> HostSectionParser
<<<mem>>> / Transition HostSectionParser -> HostSectionParser
<<<cpu>>> / Transition HostSectionParser -> HostSectionParser
<<<uptime>>> / Transition HostSectionParser -> HostSectionParser
<<<lnx_if>>> / Transition HostSectionParser -> HostSectionParser
<<<lnx_if:sep(58)>>> / Transition HostSectionParser -> HostSectionParser
<<<tcp_conn_stats>>> / Transition HostSectionParser -> HostSectionParser
<<<diskstat>>> / Transition HostSectionParser -> HostSectionParser
<<<kernel>>> / Transition HostSectionParser -> HostSectionParser
<<<md>>> / Transition HostSectionParser -> HostSectionParser
<<<vbox_guest>>> / Transition HostSectionParser -> HostSectionParser
<<<job>>> / Transition HostSectionParser -> HostSectionParser
<<<local:sep(0)>>> / Transition HostSectionParser -> HostSectionParser
<<<postgres_instances>>> / Transition HostSectionParser -> HostSectionParser
<<<postgres_sessions>>> / Transition HostSectionParser -> HostSectionParser
<<<postgres_stat_database:sep(59)>>> / Transition HostSectionParser -> HostSectionParser
<<<postgres_locks:sep(59)>>> / Transition HostSectionParser -> HostSectionParser
<<<postgres_query_duration:sep(59)>>> / Transition HostSectionParser -> HostSectionParser
<<<postgres_connections:sep(59)>>> / Transition HostSectionParser -> HostSectionParser
<<<postgres_stats:sep(59)>>> / Transition HostSectionParser -> HostSectionParser
<<<postgres_version:sep(1)>>> / Transition HostSectionParser -> HostSectionParser
<<<postgres_conn_time>>> / Transition HostSectionParser -> HostSectionParser
<<<postgres_bloat:sep(59)>>> / Transition HostSectionParser -> HostSectionParser
No persisted sections
  -> Add sections: ['check_mk', 'cifsmounts', 'cpu', 'df', 'diskstat', 'job', 'kernel', 'labels', 'lnx_if', 'local', 'md', 'mem', 'mounts', 'nfsmounts', 'postgres_bloat', 'postgres_conn_time', 'postgres_connections', 'postgres_instances', 'postgres_locks', 'postgres_query_duration', 'postgres_sessions', 'postgres_stat_database', 'postgres_stats', 'postgres_version', 'ps_lnx', 'systemd_units', 'tcp_conn_stats', 'uptime', 'vbox_guest']
  Source: SourceType.HOST/FetcherType.PIGGYBACK
No persisted sections
  -> Add sections: []
Received no piggyback data
Received no piggyback data
[cpu_tracking] Start [7f96da969790]
value store: synchronizing
Trying to acquire lock on /omd/sites/bk/tmp/check_mk/counters/wmsserver
Got lock on /omd/sites/bk/tmp/check_mk/counters/wmsserver
value store: loading from disk
Releasing lock on /omd/sites/bk/tmp/check_mk/counters/wmsserver
Released lock on /omd/sites/bk/tmp/check_mk/counters/wmsserver
CPU load             15 min load: 2.52, 15 min load per core: 0.63 (4 cores)
CPU utilization      Total CPU: 47.38%
Check_MK Agent       Version: 2.0.0p19, OS: linux
Disk IO SUMMARY      PEND - Initializing counters
Filesystem /         51.47% used (25.27 of 49.10 GB), trend: 0.00 B / 24 hours
Interface 2          [ens33], (up), MAC: 00:0C:29:28:2F:0A, Speed: 1 GBit/s
Interface 3          [br-59ed40c04ade], (up), MAC: 02:42:3E:A3:A5:96, Speed: unknown
Kernel Performance   Process Creations: 0.00/s, Context Switches: 0.00/s, Major Page Faults: 0.00/s, Page Swap in: 0.00/s, Page Swap Out: 0.00/s
Memory               Total virtual memory: 24.55% - 1.41 GB of 5.75 GB, 9 additional details available
Mount options of /   Mount options exactly as expected
Number of threads    295, Usage: 0.96%

[...]
Plenty_Info: Status 11.02 [6583,6594] This is a local Check, Count: 2.00
Plenty_Info: Status 11.03 [], Count: 0.00
[...]

PostgreSQL ANALYZE MAIN/postgres No never checked tables
PostgreSQL ANALYZE MAIN/sscwmsdb Table: td_stocktaking_record_20220108, Not analyzed for: 1 year 137 days, No never checked tables
PostgreSQL Bloat MAIN/postgres Maximum table bloat at pg_amproc: 0.25%, Maximum wasted tablespace at pg_collation: 16.00 kB, Maximum index bloat at pg_amproc: 0.5%, Maximum wasted indexspace at pg_depend: 96.00 kB, Summary of top 10 wasted tablespace: 120.00 kB, Summary of top 10 wasted indexspace: 128.00 kB
PostgreSQL Bloat MAIN/sscwmsdb Maximum table bloat at plenty_stock: 0.83%, Maximum wasted tablespace at log_action: 90.02 MB, Maximum index bloat at plenty_stock_movement: 0.76%, Maximum wasted indexspace at plenty_stock_movement: 21.41 MB, Summary of top 10 wasted tablespace: 267.29 MB, Summary of top 10 wasted indexspace: 32.85 MB
PostgreSQL Connection Time MAIN 0.083 seconds
PostgreSQL Connections MAIN/postgres Used active connections: 1, Used active percentage: 0.5%, Used idle connections: 0, Used idle percentage: 0%
PostgreSQL Connections MAIN/sscwmsdb Used active connections: 0, Used active percentage: 0%, Used idle connections: 7, Used idle percentage: 3.5%
PostgreSQL DB MAIN/postgres Size Size is 7.44 MB
PostgreSQL DB MAIN/postgres Statistics Blocks Read: 0.00/s, Fetches: 0.00/s, Commits: 0.00/s, Deletes: 0.00/s, Updates: 0.00/s, Inserts: 0.00/s
PostgreSQL DB MAIN/sscwmsdb Size Size is 2.79 GB
PostgreSQL DB MAIN/sscwmsdb Statistics Blocks Read: 0.00/s, Fetches: 0.00/s, Commits: 0.00/s, Deletes: 0.00/s, Updates: 0.00/s, Inserts: 0.00/s
PostgreSQL DB MAIN/template0 Size Size is 7.44 MB
PostgreSQL DB MAIN/template0 Statistics Blocks Read: 0.00/s, Fetches: 0.00/s, Commits: 0.00/s, Deletes: 0.00/s, Updates: 0.00/s, Inserts: 0.00/s
PostgreSQL DB MAIN/template1 Size Size is 7.44 MB
PostgreSQL DB MAIN/template1 Statistics Blocks Read: 0.00/s, Fetches: 0.00/s, Commits: 0.00/s, Deletes: 0.00/s, Updates: 0.00/s, Inserts: 0.00/s
PostgreSQL Daemon Sessions MAIN Total: 2, Running: 2
PostgreSQL Instance MAIN Status: running with PID 1341
PostgreSQL Locks MAIN/postgres Access Share Locks 1, Exclusive Locks 0
PostgreSQL Locks MAIN/sscwmsdb Access Share Locks 6, Exclusive Locks 0
PostgreSQL Query Duration MAIN/postgres Longest query: 0 seconds, PID: 121035, Query: SELECT datname, datid, usename, client_addr, state AS state, COALESCE(ROUND(EXTRACT(epoch FROM now()-query_start)),0) AS seconds, pid, regexp_replace(query, E'[\n\r\u2028]+', ' ', 'g' ) AS current_query FROM pg_stat_activity WHERE (query_start IS NOT NULL AND (state NOT LIKE 'idle%' OR state IS NULL)) ORDER BY query_start, pid DESC
PostgreSQL Query Duration MAIN/sscwmsdb No queries running
PostgreSQL VACUUM MAIN/postgres No never checked tables
PostgreSQL VACUUM MAIN/sscwmsdb Table: td_stocktaking_record_20220108, Not vacuumed for: 1 year 137 days, No never checked tables
PostgreSQL Version main PostgreSQL 10.23 (Ubuntu 10.23-0ubuntu0.18.04.2) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0, 64-bit
Systemd Service Summary Total: 141, Disabled: 5, Failed: 2, 2 services failed (networkd-dispatcher, unattended-upgrades)(!!)
TCP Connections      Established: 27
Uptime               Up since Sep 07 2023 16:34:50, Uptime: 4 days 23 hours
No piggyback files for 'wmsserver'. Skip processing.
No piggyback files for '192.168.200.90'. Skip processing.
[cpu_tracking] Stop [7f96da969790 - Snapshot(process=posix.times_result(user=0.03000000000000025, system=0.0, children_user=0.0, children_system=0.0, elapsed=0.03999999910593033))]
[agent] Success, execution time 87.1 sec | execution_time=87.140 user_time=0.030 system_time=0.010 children_user_time=0.000 children_system_time=0.000 cmk_time_agent=87.090

I'd assume the checks are running the way they should, as they always did.
I ran a scheduled OS update (apt-get update / upgrade), and I think it was around this time that monitoring of this host stopped working. I started by investigating the update, but since the checks run perfectly fine on the command line, there must be a different problem.

In the server's GUI all hosts look fine except this one.

I'm stuck. Any hints on what to look for are appreciated.

Hi @fho,

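By default, the Nagios core in the Checkmk Raw Edition has a service check timeout of 60 seconds, but your agent needs 87 seconds to respond. The check is therefore killed before it can deliver a result to the GUI, even though it completes when run manually on the command line.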
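To make the comparison concrete, here is a minimal sketch that reads the `cmk -nv` output from stdin and compares the reported `execution_time` with a timeout. The function name `agent_time_vs_timeout` is hypothetical and not part of Checkmk; the 60-second default is the value mentioned above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Checkmk): reads `cmk -nv <host>` output on
# stdin and reports whether the agent run time exceeds a timeout in seconds.
agent_time_vs_timeout() {
    local timeout="${1:-60}"   # default: the 60 s core timeout mentioned above
    awk -v limit="$timeout" '
        # Pick the execution_time=<seconds> value from the "[agent]" summary line.
        match($0, /execution_time=[0-9.]+/) {
            secs = substr($0, RSTART + 15, RLENGTH - 15) + 0
            found = 1
        }
        END {
            if (!found) { print "no execution_time found"; exit 2 }
            if (secs > limit) {
                printf "%.1f s exceeds timeout of %d s\n", secs, limit
                exit 1
            }
            printf "%.1f s is within timeout of %d s\n", secs, limit
        }'
}

# Usage as site user (host name taken from this thread):
#   cmk -nv wmsserver | agent_time_vs_timeout 60
```

The function returns exit code 1 when the agent is slower than the timeout, so it can also be used in a small script that checks several hosts.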

You have two options:

a) Configure some plugins to run asynchronously, which speeds up the actual agent contact (see "Monitoring Windows - The new agent for Windows in detail").

b) Allow Nagios to execute service checks for more than 60 seconds (see "(Service Check Timed Out) for "Check_MK" service - #2 by andreas-doehler").
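For option a) on a Linux host like the one in this thread, the agent runs scripts placed in a subdirectory named after a number of seconds asynchronously and caches their output for that interval. The sketch below moves a script into such a subdirectory. The function name `cache_agent_script`, the script name `plenty_info`, and the 300-second interval are assumptions for illustration; which script actually causes the delay is not known from the output above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: move an agent plugin or local check into a
# "<seconds>" subdirectory so the Linux agent runs it asynchronously
# and serves cached output in between.
cache_agent_script() {
    local script="$1"     # full path to the plugin or local check
    local interval="$2"   # cache interval in seconds, e.g. 300
    local dir
    dir="$(dirname "$script")/$interval"
    mkdir -p "$dir"
    mv "$script" "$dir/"
    # Print the new location for confirmation.
    echo "$dir/$(basename "$script")"
}

# Usage on the monitored host, as root (script name is an assumption):
#   cache_agent_script /usr/lib/check_mk_agent/local/plenty_info 300
```

Running `check_mk_agent` by hand with `time` before and after the move shows whether the chosen script was the slow one.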


That was it. As a first step I went with workaround/solution b:

Go to ~/etc/nagios/nagios.d/, open tuning.cfg, and search for the option service_check_timeout=60. This is the global timeout for service checks.

I increased it to 120 seconds. :pray: :+1:
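For anyone who prefers to script the change, here is a minimal sketch. The function name `set_service_check_timeout` is hypothetical; the file path and option name are the ones quoted above. It keeps a `.bak` copy of the file before rewriting it. Restarting the core afterwards with `omd restart nagios` is an assumption about how the new value gets picked up on a Raw Edition site.

```shell
#!/usr/bin/env bash
# Hypothetical helper: set service_check_timeout in the Nagios tuning file.
# Run as the site user; the default path is the one named in this thread.
set_service_check_timeout() {
    local seconds="$1"
    local cfg="${2:-$HOME/etc/nagios/nagios.d/tuning.cfg}"
    # Keep a backup, then rewrite the option from the backup into the file.
    cp "$cfg" "$cfg.bak"
    sed -E "s/^service_check_timeout=[0-9]+/service_check_timeout=${seconds}/" \
        "$cfg.bak" > "$cfg"
    # Show the resulting line for confirmation.
    grep '^service_check_timeout=' "$cfg"
}

# Usage as site user:
#   set_service_check_timeout 120
#   omd restart nagios   # assumption: restart the core to apply the change
```

Raising the timeout only hides the slow agent, so it is still worth finding out which plugin or local check takes most of the 87 seconds.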
