Problem with setup for monitoring Oracle DB

CMK version:2.3.0p12.cce
OS version:Ubuntu 22.04.4 LTS

Error message:ORA-99999

Hi,

I am trying to setup Oracle DB monitoring, and have followed instructions from the documentation Monitoring Oracle databases ..

Anyway, the problem I am facing is that there is no output from mk_oracle plugin.

When running /usr/lib/check_mk_agent/plugins/60/mk_oracle -t -l --no-spool

I get an error in the log file :

2024-08-29 16:13:17 [0] [preliminaries] SIDs                : repo stddd
2024-08-29 16:13:17 [0] [preliminaries] Remote instances    :
2024-08-29 16:13:17 [0] [preliminaries] Piggyback hosts     :
2024-08-29 16:13:17 [0] [preliminaries] OS                  : Linux
2024-08-29 16:13:17 [0] [preliminaries] SYNC_SECTIONS       : instance sessions logswitches undostat recovery_area processes recovery_status longactivesessions dataguard_stats performance locks systemparameter
2024-08-29 16:13:17 [0] [preliminaries] ASYNC_SECTIONS      : tablespaces rman jobs resumable iostats
2024-08-29 16:13:17 [0] [preliminaries] SYNC_ASM_SECTIONS   : instance processes
2024-08-29 16:13:17 [0] [preliminaries] ASYNC_ASM_SECTIONS  : asm_diskgroup
2024-08-29 16:13:17 [0] [preliminaries] CACHE_MAXAGE        : 600
2024-08-29 16:13:17 [0] [preliminaries] ONLY_SIDS           :
2024-08-29 16:13:17 [0] [preliminaries] SKIP_SIDS           :
2024-08-29 16:13:17 [0] [preliminaries] Custom SQLs sections:
2024-08-29 16:13:17 [0] [local] setting up intermediate query result directory '/var/lib/check_mk_agent/tmp/mk_oracle_tmp_dir/tasks/2210460'
2024-08-29 16:13:17 [0] [repo] [local] starting background task '1'
2024-08-29 16:13:17 [0] [repo] [local] max parallel task running, waiting for new slot
2024-08-29 16:13:17 [2] [repo] [set_ora_env] TNS_ADMIN/sqlnet.ora: /etc/check_mk//sqlnet.ora
2024-08-29 16:13:17 [0] [stddd] [local] starting background task '2'
2024-08-29 16:13:17 [0] [stddd] [local] max parallel task running, waiting for new slot
2024-08-29 16:13:17 [2] [stddd] [set_ora_env] ORA-99999 ORACLE_HOME for SID 'stddd' not found or not existing!

Content of /etc/check_mk/mk_oracle.cfg

Syntax:

DBUSER=‘USERNAME:PASSWORD’

DBUSER=‘checkmk:redacted::localhost:1521:’

Any suggestions would be helpful :slightly_smiling_face:

M

Hi

mk_oracle searches in /etc/oratab or /var/opt/oracle/oratab for the SID ‘stddd’ to determine oracle_home for this db. if the oratab file cannot be found or read then this error message appears. That is the first thing I would check.

L

1 Like

Hey @LaSoe thanks for the idea, I checked that file on that location /etc/oratab and all I have there is this entry:

repo:/u01/app/oracle/product/19.0.0/db:N

Can I add this ‘stddd’ SID also here, will I mess something up, since this is a production system? I am not an Oracle DB admin, and we have none currently.

Best regards,
M

I’m not an Oracle expert, but I thought a database is automatically added to the oratab during installation.

It might be possible to add the entry manually (without guarantee). However, it is not advisable to experiment in a production environment. I would recommend consulting the Oracle documentation for more detailed guidance.

Thanks, I tried doing this on test env.

There there was an exact problem, /etc/oratab was empty. After adding the SIDs the plugin displays empty lines.

Content of oratab file:

repo:/u01/app/oracle/product/19.0.0/db:N
db1t:/u01/app/oracle/product/19.0.0/db:N
db2t:/u01/app/oracle/product/19.0.0/db:N

Content of log file /var/lib/check_mk_agent/log/mk_oracle.log:

2024-08-30 12:28:03 [0] [preliminaries] SIDs                : repo db1t db2t
2024-08-30 12:28:03 [0] [preliminaries] Remote instances    :
2024-08-30 12:28:03 [0] [preliminaries] Piggyback hosts     :
2024-08-30 12:28:03 [0] [preliminaries] OS                  : Linux
2024-08-30 12:28:03 [0] [preliminaries] SYNC_SECTIONS       : instance sessions logswitches undostat recovery_area processes recovery_status longactivesessions dataguard_stats performance locks systemparameter
2024-08-30 12:28:03 [0] [preliminaries] ASYNC_SECTIONS      : tablespaces rman jobs resumable iostats
2024-08-30 12:28:03 [0] [preliminaries] SYNC_ASM_SECTIONS   : instance processes
2024-08-30 12:28:03 [0] [preliminaries] ASYNC_ASM_SECTIONS  : asm_diskgroup
2024-08-30 12:28:03 [0] [preliminaries] CACHE_MAXAGE        : 600
2024-08-30 12:28:03 [0] [preliminaries] ONLY_SIDS           :
2024-08-30 12:28:03 [0] [preliminaries] SKIP_SIDS           :
2024-08-30 12:28:03 [0] [preliminaries] Custom SQLs sections:
2024-08-30 12:28:03 [0] [local] setting up intermediate query result directory '/var/lib/check_mk_agent/tmp/mk_oracle_tmp_dir/tasks/3229206'
2024-08-30 12:28:03 [0] [repo] [local] starting background task '1'
2024-08-30 12:28:03 [0] [repo] [local] max parallel task running, waiting for new slot
2024-08-30 12:28:03 [2] [repo] [set_ora_env] TNS_ADMIN/sqlnet.ora: /etc/check_mk//sqlnet.ora
2024-08-30 12:28:03 [0] [db1t] [local] starting background task '2'
2024-08-30 12:28:03 [0] [db1t] [local] max parallel task running, waiting for new slot
2024-08-30 12:28:03 [2] [db1t] [set_ora_env] TNS_ADMIN/sqlnet.ora: /etc/check_mk//sqlnet.ora
2024-08-30 12:28:03 [0] [db2t] [local] starting background task '3'
2024-08-30 12:28:03 [0] [db2t] [local] max parallel task running, waiting for new slot
2024-08-30 12:28:03 [2] [db2t] [set_ora_env] TNS_ADMIN/sqlnet.ora: /etc/check_mk//sqlnet.ora

Checkmk server doesn’t recognize databases. Nothing is displayed.

When I do a telnet test from checkmk server to the Oracle DB server I get only empty lines:

......
btime 1680093526
processes 7426209
procs_running 2
procs_blocked 0
softirq 21450132538 0 3000830448 1 246301279 2946057607 0 5299368 605931914 3890860 1756919173
<<<md>>>
Personalities :
unused devices: <none>
<<<vbox_guest>>>
<<<job>>>
<<<oracle_instance>>>
<<<oracle_sessions>>>
<<<oracle_logswitches>>>
<<<oracle_undostat>>>
<<<oracle_recovery_area>>>
<<<oracle_processes>>>
<<<oracle_recovery_status>>>
<<<oracle_longactivesessions>>>
<<<oracle_dataguard_stats>>>
<<<oracle_performance>>>
<<<oracle_locks>>>
<<<oracle_systemparameter>>>
<<<oracle_tablespaces>>>
<<<oracle_rman>>>
<<<oracle_jobs>>>
<<<oracle_resumable>>>
<<<oracle_iostats>>>
<<<oracle_instance>>>
<<<oracle_processes>>>
<<<oracle_asm_diskgroup>>>
<<<chrony:cached(1725014460,120)>>>

Any further suggestions?

Btw. thanks for the help around oratab.

M

In our setup, we don’t have a sqlnet.ora and tnsnames.ora files in “/etc/check_mk/”.

Maybe you can rename these files and try again. If that doesn’t work, you can also try to configure the TNS_ADMIN in your mk_oracle.cfg (oracle default path: $ORACLE_HOME/network/admin) .

export TNS_ADMIN=/opt/oracle/product/19.5.0.0/network/admin

Another possible reason why it’s working for me is that I’ve added up a fallback for TNS_ADMIN in our mk_oracle:

TNS_ADMIN=${TNS_ADMIN:-$MK_CONFDIR}
if ! test -f "${TNS_ADMIN}/sqlnet.ora"; then
    logging -c "[${sid}] [set_ora_env]" "no sqlnet.ora found in '${TNS_ADMIN}'. checking as next the default location ORACLE_HOME."

    # use oracle default path as fallback
    TNS_ADMIN="${ORACLE_HOME}/network/admin"
    if ! test -f "${TNS_ADMIN}/sqlnet.ora"; then
        logging -c -e "[${sid}] [set_ora_env]" "ORA-99999 sqlnet.ora for SID '${ORACLE_SID}' not found or not existing!"
        exit 1
    fi
fi
2 Likes

As @LaSoe said in his last comment, you can try setting the TNS_ADMIN depending on the path where it is located and then perfrom a connection test using “–no-spool -t”

1 Like

Hi,

Thanks all for the help. The solution definitely was:

  1. adding missing db instances to the oratab file in /etc/oratab
  2. adding export TNS_ADMIN=/opt/oracle/product/19.5.0.0/network/admin to the /etc/check_mk/mk_oracle.cfg file

After these two edits, the database instances appeared in Checkmk.

This fix would have made the “export TNS_ADMIN” used by marquis unnecessary:

https://ideas.checkmk.com/suggestions/321109/mk_oracle-search-for-tnsnamesora-also-in-the-default-oracle_homenetworkadmin-dir

1 Like

The file is normaly changed by the DBCA (Database Creation Assistant).
If the entry is missing, the database has been renamed, copied from anaother system or was created on a different way - not with DBCA.

Normaly the file could be changed after the installation.
DBCA will fail during database creation, when an entry for the new database is existing in the oratab.

Make sure to edit the file with:
<ORACLE_SID>:<ORACLE_HOME>:N

The ‘N’ at the end is for ‘no autostart’ of database during boot, when startup scripts from Oracle are in use. A ‘Y’ enables the autostart.

Be careful with the ORACLE_HOME.
If an Instance is started, the environment variable for ORACLE_HOME + ORACLE_SID forms a hash value for the address of the shared memory segment for the instance.

The address is different, with or without a ‘/’ at the end!

Some notes about the TNS_ADMIN variable in mk_oracle.
I made the design in that way, because the default behavior from Oracle is exactly the same. Debugging connections issues with Oracle Client directly will be harder, when mk_oracle has a different logic to find the tnsnames.ora for the connection.
This is really important when working with wallets, because that has a high potential for debugging with the Oracle Client.

Setting TNS_ADMIN on a fixed ORACLE_HOME could be dangerous, because that is a huge difference to the code in mk_oracle.
The plugin searchs for instances and gets the ORACLE_HOME per Instance, to form the TNS_ADMIN. We search for sqlnet.ora inside of the ORACLE_HOME for the instance!

This is fine, when only 1 instance is running on the system but it could be different with Instances with distinct ORACLE_HOMEs and a fixed TNS_ADMIN pointing to 1 of the instances.

Multiple ORACLE_HOMEs are common these days. Some customers are patching the database with changing the value for ORACLE_HOME. If someone removes the old ORACLE_HOME the monitoring is broken, because the path from TNS_ADMIN is invalid - that’s why I search with $ORACLE_HOME/network/admin from the running instance instead a fixed value.

Kind Regards
Thorsten

Hey, @Rendanic thanks for the follow up. Yes, the database was copied from an older version then the DB admin did the restore when we were migrating from the older version of Oracle DB.

Just a quick question. Since those entries didn’t exist in oratab file does this mean that those databases couldn’t auto start after the server restart or there is another mechanism in play?

The Database software installation of Oracle or the DBCA do not install script for autostart of Instances on a server. root.sh changes some files on the system but do not install autostart scripts.
You have to install them manually.

The state ‘Y’ is needed for the examplescripts


This topic was automatically closed 365 days after the last reply. New replies are no longer allowed. Contact an admin if you think this should be re-opened.