Mk_oracle "Timed out plugin" for all CDBs starting with 2.4.0

chauhan_sudhir · June 16, 2025, 2:16pm

Werk #16813 was implemented in such a way that mk_oracle can no longer create an additional process that outlives the mk_oracle process itself.
mk_oracle implemented its own async sections: the main process would start a second process for the slow work and then exit relatively quickly, while the slow process produced the async sections. Since Werk #16813 this is no longer possible, because all sub-processes are killed when the main process exits.

Can you try out this change?

> OMD[mysite]:~$ diff share/check_mk/agents/plugins/mk_oracle ~/local/share/check_mk/agents/plugins/mk_oracle
> <             cat <<HERE | nohup "${THIS_SHELL}" >/dev/null 2>&1 &
> ---
> >             cat <<HERE | nohup "${THIS_SHELL}" >/dev/null 2>&1

If you roll out your plugins with the agent bakery, the local copy of mk_oracle has to be used; otherwise you can make the change manually.
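
The difference between the two lines in the diff can be sketched as follows. This is only an illustration of the pattern described above, not the actual mk_oracle code: `CACHE_DIR`, `demo.cache` and both function names are hypothetical, and plain `sh` stands in for `${THIS_SHELL}`.

```shell
#!/bin/sh
# Illustrative sketch only; NOT the actual mk_oracle code.
# CACHE_DIR, demo.cache and both function names are hypothetical.
CACHE_DIR="$(mktemp -d)"

# Original line: the trailing '&' detaches the worker, so the caller
# returns immediately and the cache file appears later.
run_detached() {
    cat <<HERE | nohup sh >/dev/null 2>&1 &
sleep 1
echo "slow result" >"${CACHE_DIR}/demo.cache.new"
mv "${CACHE_DIR}/demo.cache.new" "${CACHE_DIR}/demo.cache"
HERE
}

# Changed line: without '&' the caller blocks until the worker has
# finished and the cache file has been written.
run_foreground() {
    cat <<HERE | nohup sh >/dev/null 2>&1
sleep 1
echo "slow result" >"${CACHE_DIR}/demo.cache.new"
mv "${CACHE_DIR}/demo.cache.new" "${CACHE_DIR}/demo.cache"
HERE
}
```

Calling `run_detached` returns at once and `demo.cache` shows up about a second later, while `run_foreground` returns only after the file exists. If sub-processes are killed when the main process exits, as described above for Werk #16813, the detached worker would never get to write its file.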

michael_kauschke · June 17, 2025, 8:58am

Hello,

I changed the line in /usr/lib/check_mk_agent/plugins/60/mk_oracle as follows:

cat <<HERE | nohup "${THIS_SHELL}" >/dev/null 2>&1

I deleted the CDB database cache files here:
/var/lib/check_mk_agent/cache.

Unfortunately, the error occurred again after 20 minutes.

-rw-r--r-- 1 root root 73 Jun 17 10:39 oracle_PRODZENT1.cache.fail
-rw-r--r-- 1 root root 0 Jun 17 10:39 oracle_PRODZENT1.cache.new.35034

Regards,
Michael
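
For anyone who wants to check for the same symptom, files like the two in the listing above can be searched for with a small helper. A minimal sketch: the function name is hypothetical, the default directory is the one named in the post, and reading `.cache.fail` as a failure marker is an assumption based only on the file name.

```shell
#!/bin/sh
# Hypothetical helper, not part of the Checkmk agent.
# Default directory taken from the post above.
list_suspect_cache_files() {
    dir="${1:-/var/lib/check_mk_agent/cache}"
    # Files ending in .cache.fail (assumed to mark a failed run).
    find "$dir" -maxdepth 1 -type f -name '*.cache.fail' -print
    # Zero-byte *.cache.new.* files, like the empty one in the listing.
    find "$dir" -maxdepth 1 -type f -name '*.cache.new.*' -size 0 -print
}
```

Running `list_suspect_cache_files` without an argument searches the default directory; a different directory can be passed as the first argument.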

chauhan_sudhir · June 17, 2025, 9:11am

Are you using systemd, and have you set this?

michael_kauschke · June 17, 2025, 9:40am

We are using systemd; xinetd is not installed. We use the legacy agent and do not use the agent bakery.

chauhan_sudhir · June 17, 2025, 9:44am

> We are using systemd, xinetd is not installed

As per the screenshot, you should put the agent plugin under the folder “60”. Is it already the case?

michael_kauschke · June 17, 2025, 9:46am

Yes

/usr/lib/check_mk_agent/plugins/60
[root@prod-zentrale1 60]# ll
total 136
-rwxrwxr-x 1 root root 138209 Jun 17 09:29 mk_oracle
[root@prod-zentrale1 60]#

chauhan_sudhir · June 17, 2025, 10:14am

Thanks for confirming. Could you increase that interval from 60 to 300 and see if this brings any improvement?

michael_kauschke · June 17, 2025, 10:42am

Yes, of course. I have set the interval to 300 now.

michael_kauschke · June 17, 2025, 11:25am

Unfortunately, increasing the parameter to 300 didn’t help.

-rw-r--r-- 1 root root 74 Jun 17 13:27 oracle_PRODZENT1.cache.fail
-rw-r--r-- 1 root root 0 Jun 17 13:27 oracle_PRODZENT1.cache.new.22974

michael_kauschke · June 17, 2025, 2:22pm

Hello,

I’ve reverted all changes, as this caused some SYNC checks to go into stale mode and trigger a notification. I’m happy to assist with further testing.

Regards,
Michael

bkuhn · July 24, 2025, 8:34am

Yes, and these are working

bkuhn · July 24, 2025, 8:36am

Question: Is this supposed to be fixed? We encountered that problem after the update from 2.3p33 to 2.4p8.

chauhan_sudhir · July 24, 2025, 3:01pm

The fix is ready. We have tested it successfully with different customers. It will be part of the next patch release. The official werk will be available next week.

bkuhn · July 24, 2025, 3:18pm

I see, thank you. But when will that patch release be available? The update crashed the Oracle monitoring for a big environment. We already had the trouble with the Windows Agent Updates there, which was fixed in p8…

chauhan_sudhir · July 24, 2025, 3:27pm

As I said, next week.
If you need an MKP please open a ticket.
This problem is only for mk_oracle running on Linux systems.

foobar · July 25, 2025, 6:24am

Can you confirm that Solaris and AIX are not affected?

chauhan_sudhir · July 31, 2025, 3:39pm

Here is the werk: Werk #18472: Restore async/cached agent plugins. Sorry, p9 will happen next week.