Mk_oracle "Timed out plugin" for all CDBs starting with 2.4.0

Werk #16813 was implemented in such a way that mk_oracle can no longer create an additional process that outlives the mk_oracle process itself.
mk_oracle implemented its own async sections: the main process would start a second process for the slow work and then exit fairly quickly, while the slow process went on to produce the async sections. After werk #16813 this is no longer possible, as all sub-processes are killed when the main process exits.
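To illustrate the pattern described above, here is a minimal sketch, not the actual mk_oracle code. The function name `run_async_section`, the section name `demo_section` and the temporary cache path are hypothetical and only serve the illustration. It shows a parent shell handing a small script to a detached shell via a here-document and returning immediately, while the background shell writes a cache file later.

```shell
#!/bin/sh
# Hypothetical sketch of the "own async section" pattern; names and paths are made up.
WORKDIR=$(mktemp -d)
CACHE_FILE="${WORKDIR}/demo.cache"   # stand-in for a plugin cache file

run_async_section() {
    # Pipe a generated script into a second shell. The trailing '&' puts the
    # pipeline into the background, so the caller does not wait for it.
    cat <<HERE | nohup sh >/dev/null 2>&1 &
sleep 1
echo "<<<demo_section>>>" > "${CACHE_FILE}.new"
mv "${CACHE_FILE}.new" "${CACHE_FILE}"
HERE
}

run_async_section
# The main process gets here right away; the cache file is written later
# by the background shell, if that shell is allowed to keep running.
echo "main process done"
```

In a plain shell the background process survives and the cache file shows up about a second later. If all sub-processes are killed as soon as the main process exits, as described for werk #16813, the slow part never finishes and the cache file is never completed.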

Can you try out this change?

> OMD[mysite]:~$ diff share/check_mk/agents/plugins/mk_oracle ~/local/share/check_mk/agents/plugins/mk_oracle
> <             cat <<HERE | nohup "${THIS_SHELL}" >/dev/null 2>&1 &
> ---
> >             cat <<HERE | nohup "${THIS_SHELL}" >/dev/null 2>&1

The local copy of mk_oracle has to be used if you are using the agent bakery to roll out your plugins; otherwise you can make the changes manually.
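For a manual edit, the change from the diff could be applied roughly like this. This is only a sketch under assumptions: it works on a throwaway stand-in file in a temporary directory (`DEMO_PLUGIN` is a hypothetical name), not on the real plugin, and the `sed` expression assumes the line looks exactly as in the diff above.

```shell
#!/bin/sh
# Hypothetical demo: apply the one-line change to a stand-in file, not the real plugin.
DEMO_DIR=$(mktemp -d)
DEMO_PLUGIN="${DEMO_DIR}/mk_oracle"

# Stand-in containing only the line that the diff touches.
cat > "${DEMO_PLUGIN}" <<'EOF'
            cat <<HERE | nohup "${THIS_SHELL}" >/dev/null 2>&1 &
EOF

# Keep a backup so the change can be reverted.
cp "${DEMO_PLUGIN}" "${DEMO_PLUGIN}.orig"

# Remove the trailing " &" so the sub-shell is no longer sent to the background.
sed 's|\(nohup "\${THIS_SHELL}" >/dev/null 2>&1\) &$|\1|' \
    "${DEMO_PLUGIN}.orig" > "${DEMO_PLUGIN}"

# Show what changed; diff exits non-zero when the files differ.
diff "${DEMO_PLUGIN}.orig" "${DEMO_PLUGIN}" || true
```

Keeping the `.orig` copy makes it easy to go back if the change does not help.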

Hello,

I changed the line in /usr/lib/check_mk_agent/plugins/60/mk_oracle as follows:

cat <<HERE | nohup "${THIS_SHELL}" >/dev/null 2>&1

I deleted the CDB database cache files here:
/var/lib/check_mk_agent/cache.

Unfortunately, the error occurred again after 20 minutes.

-rw-r--r-- 1 root root 73 Jun 17 10:39 oracle_PRODZENT1.cache.fail
-rw-r--r-- 1 root root 0 Jun 17 10:39 oracle_PRODZENT1.cache.new.35034

Regards,
Michael

Are you using systemd, and have you set this?
[screenshot]

We are using systemd; xinetd is not installed. We use the legacy agent and do not use the agent bakery.

As per the screenshot, you should put the agent plugin in the folder “60”. Is that already the case?

Yes

/usr/lib/check_mk_agent/plugins/60
[root@prod-zentrale1 60]# ll
total 136
-rwxrwxr-x 1 root root 138209 Jun 17 09:29 mk_oracle
[root@prod-zentrale1 60]# 

Thanks for confirming. Could you increase that interval from 60 to 300 and see if this brings any improvement?

Yes, of course. I have set the interval to 300 now.

Unfortunately, increasing the parameter to 300 didn’t help.

-rw-r--r-- 1 root root 74 Jun 17 13:27 oracle_PRODZENT1.cache.fail
-rw-r--r-- 1 root root 0 Jun 17 13:27 oracle_PRODZENT1.cache.new.22974

Hello,

I’ve reverted all changes, as this caused some SYNC checks to go into stale mode and trigger a notification. I’m happy to assist with further testing.

Regards,
Michael

Yes, and these are working

Question: Is this supposed to be fixed? We encountered this problem after the update from 2.3p33 to 2.4p8.

The fix is ready. We have tested it successfully with several customers. It will be part of the next patch release. The official werk will be available next week.

I see, thank you. But when will that patch release be available? The update broke the Oracle monitoring for a big environment. We already had trouble there with the Windows agent updates, which was fixed in p8…

As I said, next week.
If you need an MKP, please open a ticket.
This problem only affects mk_oracle running on Linux systems.

Can you confirm that Solaris and AIX are not affected?

Here is the werk: Werk #18472: Restore async/cached agent plugins. Sorry, p9 will be released next week.