Problem when changing from mk_postgres to mk_postgres.py

**CMK version:** 2.0.0p33 CEE
**OS version:** RHEL 7.8

**Error message:** Some services disappear

Output of “cmk --debug -vvn hostname”: (If it is a problem with checks or plugins)

OMD[xxx]:~$ cmk --debug -vvn xxx.xxx.xxx
Checkmk version 2.0.0p33
Try license usage history update.
Trying to acquire lock on /omd/sites/xxx/var/check_mk/license_usage/next_run
Got lock on /omd/sites/xxx/var/check_mk/license_usage/next_run
Trying to acquire lock on /omd/sites/xxx/var/check_mk/license_usage/history.json
Got lock on /omd/sites/xxx/var/check_mk/license_usage/history.json
Next run time has not been reached yet. Abort.
Releasing lock on /omd/sites/xxx/var/check_mk/license_usage/history.json
Released lock on /omd/sites/xxx/var/check_mk/license_usage/history.json
Releasing lock on /omd/sites/xxx/var/check_mk/license_usage/next_run
Released lock on /omd/sites/xxx/var/check_mk/license_usage/next_run
Loading autochecks from /omd/sites/xxx/var/check_mk/autochecks/xxx.xxx.xxx.mk
+ FETCHING DATA
  Source: SourceType.HOST/FetcherType.TCP
[cpu_tracking] Start [7f4fd7451b20]
[TCPFetcher] Fetch with cache settings: DefaultAgentFileCache(base_path=PosixPath('/omd/sites/xxx/tmp/check_mk/cache/xxx.xxx.xxx'), max_age=MaxAge(checking=0, discovery=120, inventory=120), disabled=False, use_outdated=False, simulation=False)
Not using cache (Too old. Age is 4 sec, allowed is 0 sec)
[TCPFetcher] Execute data source
Connecting via TCP to host.ip.address:6556 (5.0s timeout)
Reading data from agent
Output is not encrypted
Write data to cache file /omd/sites/xxx/tmp/check_mk/cache/xxx.xxx.xxx
Trying to acquire lock on /omd/sites/xxx/tmp/check_mk/cache/xxx.xxx.xxx
Got lock on /omd/sites/xxx/tmp/check_mk/cache/xxx.xxx.xxx
Releasing lock on /omd/sites/xxx/tmp/check_mk/cache/xxx.xxx.xxx
Released lock on /omd/sites/xxx/tmp/check_mk/cache/xxx.xxx.xxx
Closing TCP connection to host.ip.address:6556
[cpu_tracking] Stop [7f4fd7451b20 - Snapshot(process=posix.times_result(user=0.010000000000000231, system=0.0, children_user=0.0, children_system=0.0, elapsed=2.0499999970197678))]
  Source: SourceType.HOST/FetcherType.PIGGYBACK
[cpu_tracking] Start [7f4fd748efa0]
[PiggybackFetcher] Fetch with cache settings: NoCache(base_path=PosixPath('/omd/sites/xxx/tmp/check_mk/data_source_cache/piggyback/xxx.xxx.xxx'), max_age=MaxAge(checking=0, discovery=120, inventory=120), disabled=False, use_outdated=False, simulation=False)
[PiggybackFetcher] Execute data source
No piggyback files for 'xxx.xxx.xxx'. Skip processing.
No piggyback files for 'host.ip.address'. Skip processing.
[cpu_tracking] Stop [7f4fd748efa0 - Snapshot(process=posix.times_result(user=0.0, system=0.0, children_user=0.0, children_system=0.0, elapsed=0.0))]
[cpu_tracking] Start [7f4fd755b490]
+ PARSE FETCHER RESULTS
  Source: SourceType.HOST/FetcherType.TCP
Trying to acquire lock on /omd/sites/xxx/var/check_mk/persisted/xxx.xxx.xxx
Got lock on /omd/sites/xxx/var/check_mk/persisted/xxx.xxx.xxx
Releasing lock on /omd/sites/xxx/var/check_mk/persisted/xxx.xxx.xxx
Released lock on /omd/sites/xxx/var/check_mk/persisted/xxx.xxx.xxx
Stored persisted sections: lnx_packages, lnx_distro, lnx_cpuinfo, dmidecode, lnx_uname, lnx_video, lnx_ip_r, lnx_sysctl, lnx_block_devices
Using persisted section SectionName('lnx_packages')
Using persisted section SectionName('lnx_distro')
Using persisted section SectionName('lnx_cpuinfo')
Using persisted section SectionName('dmidecode')
Using persisted section SectionName('lnx_uname')
Using persisted section SectionName('lnx_video')
Using persisted section SectionName('lnx_ip_r')
Using persisted section SectionName('lnx_sysctl')
Using persisted section SectionName('lnx_block_devices')
  -> Add sections: ['check_mk', 'chrony', 'cifsmounts', 'cpu', 'df', 'diskstat', 'dmidecode', 'filestats', 'job', 'kernel', 'labels', 'lnx_block_devices', 'lnx_cpuinfo', 'lnx_distro', 'lnx_if', 'lnx_ip_r', 'lnx_packages', 'lnx_sysctl', 'lnx_uname', 'lnx_video', 'local', 'md', 'mem', 'mounts', 'nfsmounts', 'postfix_mailq', 'postfix_mailq_status', 'postgres_connections', 'postgres_instances', 'postgres_locks', 'postgres_query_duration', 'postgres_sessions', 'postgres_stat_database', 'ps_lnx', 'systemd_units', 'tcp_conn_stats', 'uptime', 'vbox_guest']
  Source: SourceType.HOST/FetcherType.PIGGYBACK
No persisted sections loaded
  -> Add sections: []
Received no piggyback data
Loading item states
Trying to acquire lock on /omd/sites/xxx/tmp/check_mk/counters/xxx.xxx.xxx
Got lock on /omd/sites/xxx/tmp/check_mk/counters/xxx.xxx.xxx
Releasing lock on /omd/sites/xxx/tmp/check_mk/counters/xxx.xxx.xxx
Released lock on /omd/sites/xxx/tmp/check_mk/counters/xxx.xxx.xxx
...
PostgreSQL Connections CLUSTER_5432/postgres Used active connections: 1, Used active percentage: 0.5%, Used idle connections: 0, Used idle percentage: 0%
PostgreSQL Connections CLUSTER_5432/routingdatadb No active connections, No idle connections
PostgreSQL DB CLUSTER_5432/postgres Size Size is 8.14 MB
PostgreSQL DB CLUSTER_5432/postgres Statistics Blocks Read: 0.00/s, Fetches: 185.26/s, Commits: 3.23/s, Deletes: 0.00/s, Updates: 0.00/s, Inserts: 0.00/s
PostgreSQL DB CLUSTER_5432/routingdatadb Size Size is 438.16 GB
PostgreSQL DB CLUSTER_5432/routingdatadb Statistics Blocks Read: 0.00/s, Fetches: 32.00/s, Commits: 0.29/s, Deletes: 0.00/s, Updates: 0.00/s, Inserts: 0.00/s
PostgreSQL DB CLUSTER_5432/template1 Size Size is 7.90 MB
PostgreSQL DB CLUSTER_5432/template1 Statistics Blocks Read: 0.00/s, Fetches: 0.00/s, Commits: 0.00/s, Deletes: 0.00/s, Updates: 0.00/s, Inserts: 0.00/s
PostgreSQL Daemon Sessions CLUSTER_5432 Total: 1, Running: 1
PostgreSQL Instance CLUSTER_5432 Status: running with PID 79973
PostgreSQL Locks CLUSTER_5432/postgres Access Share Locks 1, Exclusive Locks 0
PostgreSQL Locks CLUSTER_5432/routingdatadb Access Share Locks 0, Exclusive Locks 0
PostgreSQL Query Duration CLUSTER_5432/postgres Longest query is 0 seconds, Username: pgmoni, PID: 51538, Query: SELECT datname, datid, usename, client_addr, state AS state, COALESCE(ROUND(EXTRACT(epoch FROM now()-query_start)),0) AS seconds, pid, regexp_replace(query, E'[\n\r\u2028]+', ' ', 'g' ) AS current_query FROM pg_stat_activity WHERE (query_start IS NOT NULL AND (state NOT LIKE 'idle%' OR state IS NULL)) ORDER BY query_start, pid DESC
PostgreSQL Query Duration CLUSTER_5432/routingdatadb No query is running
...
+ EXECUTING INVENTORY PLUGINS
...

No piggyback files for 'xxx.xxx.xxx'. Skip processing.
No piggyback files for 'host.ip.address'. Skip processing.
[cpu_tracking] Stop [7f4fd755b490 - Snapshot(process=posix.times_result(user=1.4099999999999997, system=0.14, children_user=0.0, children_system=0.0, elapsed=1.5500000044703484))]
[agent] Version: 2.0.0p33, OS: linux, Allowed IP ranges: xxx/25, Missing monitoring data for check plugins: postgres_bloat, postgres_conn_time, postgres_stats, postgres_version(!), execution time 3.6 sec | execution_time=3.600 user_time=1.420 system_time=0.140 children_user_time=0.000 children_system_time=0.000 cmk_time_agent=2.040

Hello,
Currently we are replacing the old shell script plugin for Postgres with the Python-based one. As part of this, we also create a dedicated user for Checkmk in the DB. This has worked fine so far on several hundred systems. But now I have a single system where, after replacing the plugin, some services suddenly become stale:

postgres_bloat, postgres_conn_time, postgres_stats, postgres_version

I can’t find any differences to the other systems at first sight and I can’t find any way to debug this further. Does anyone have any tips for me?

regards
Christian

Hi Christian.

Did you try to debug the output of the plugin? You can try this:
Go to a root shell on the target host.
Run “export MK_CONFDIR=/etc/check_mk/”
Run “/usr/lib/check_mk_agent/plugins/mk_postgres.py -vvv”

Rg. Christian

Hi,
it is not possible to run the plugin this way:

Traceback (most recent call last):
  File "/opt/cmk/lib/plugins/mk_postgres.py", line 1072, in <module>
    main()
  File "/opt/cmk/lib/plugins/mk_postgres.py", line 1067, in main
    postgres.execute_all_queries()
  File "/opt/cmk/lib/plugins/mk_postgres.py", line 264, in execute_all_queries
    version = self.get_server_version()
  File "/opt/cmk/lib/plugins/mk_postgres.py", line 152, in get_server_version
    return float(".".join(version_as_string.split(".")[0:2]))
ValueError: could not convert string to float: '_____'
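
For anyone reading along, here is a minimal sketch that reproduces this failure outside the plugin. It reuses only the expression visible in the traceback; the function names and the sample inputs are assumptions for illustration and are not taken from mk_postgres.py.

```python
def parse_version_like_traceback(version_as_string):
    # Same expression as the line shown in the traceback above:
    # keep the first two dot-separated parts and convert them to a float.
    return float(".".join(version_as_string.split(".")[0:2]))


def try_parse(version_as_string):
    # Hypothetical helper: return the parsed value, or the error text
    # instead of raising, so both cases can be compared side by side.
    try:
        return parse_version_like_traceback(version_as_string)
    except ValueError as exc:
        return "ValueError: %s" % exc


if __name__ == "__main__":
    # A plain version string parses fine.
    print(try_parse("9.6.24"))
    # A line made of underscores (for example part of a banner) does not.
    print(try_parse("_____"))
```

So the string reaching this line was apparently not a version number at all, which suggests that something other than the psql result ended up at the start of the output.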

Hi,
just stumbled across your error with the version split.

For me, disabling the bash welcome message of the postgres user fixed it.
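
A rough way to check whether a host is affected, assuming the plugin calls psql through a login shell of the postgres user (that is my assumption, it is not confirmed in this thread): run a command that should print nothing and see whether any output comes back. The helper names below are made up for this sketch.

```python
import subprocess


def stdout_of(command):
    """Run a command and return everything it wrote to stdout, stripped."""
    result = subprocess.run(command, capture_output=True, text=True, check=False)
    return result.stdout.strip()


def has_login_noise(command):
    """Return True if a command that should be silent still printed something."""
    return stdout_of(command) != ""


# Hypothetical call on the affected host, run as root. The user name
# "postgres" is an assumption; "true" itself prints nothing, so any output
# would come from the login shell (welcome message, banner, profile scripts):
#
#   has_login_noise(["su", "-", "postgres", "-c", "true"])
```

If this returns True, the extra text is a likely candidate for what the plugin tried to parse as the server version.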

Hi Yalion,
Yes, you are right. We also found this solution; I just forgot to post it here. Nevertheless, thank you so much :blush: