Getting: Failed to acquire watch file descriptor: Permission denied with newly installed system

CMK version: cee 2.2.0p26
OS version: Rocky Linux 9.4

Error message: May 20 09:53:02 monitoring systemd[6123]: Failed to acquire watch file descriptor: Permission denied

We installed this new host last week and imported a Checkmk backup. Everything is running, but this error now shows up in the journal and we do not know where it comes from. It seems to be related to the monitoring user, because the error appears in the middle of that user's session startup, as seen below:

May 20 09:53:01 monitoring systemd[1]: Created slice User Slice of UID 986.
May 20 09:53:01 monitoring systemd[1]: Starting User Runtime Directory /run/user/986...
May 20 09:53:02 monitoring systemd[1]: Finished User Runtime Directory /run/user/986.
May 20 09:53:02 monitoring systemd[1]: Starting User Manager for UID 986...
May 20 09:53:02 monitoring systemd[6123]: pam_unix(systemd-user:session): session opened for user monitoring(uid=986) by monitoring(uid=0)
May 20 09:53:02 monitoring systemd[6123]: Failed to acquire watch file descriptor: Permission denied
May 20 09:53:02 monitoring systemd[6123]: Queued start job for default target Main User Target.
May 20 09:53:02 monitoring systemd[6123]: Created slice User Application Slice.
May 20 09:53:02 monitoring systemd[6123]: Mark boot as successful after the user session has run 2 minutes was skipped because of an unmet condition check (ConditionUser=!@system).
May 20 09:53:02 monitoring systemd[6123]: Started Daily Cleanup of User's Temporary Directories.
May 20 09:53:02 monitoring systemd[6123]: Reached target Paths.
May 20 09:53:02 monitoring systemd[6123]: Reached target Timers.
May 20 09:53:02 monitoring systemd[6123]: Starting D-Bus User Message Bus Socket...
May 20 09:53:02 monitoring systemd[6123]: Listening on PipeWire PulseAudio.
May 20 09:53:02 monitoring systemd[6123]: Listening on PipeWire Multimedia System Sockets.
May 20 09:53:02 monitoring systemd[6123]: Starting Create User's Volatile Files and Directories...
May 20 09:53:02 monitoring systemd[6123]: Finished Create User's Volatile Files and Directories.
May 20 09:53:02 monitoring systemd[6123]: Listening on D-Bus User Message Bus Socket.
May 20 09:53:02 monitoring systemd[6123]: Reached target Sockets.
May 20 09:53:02 monitoring systemd[6123]: Reached target Basic System.
May 20 09:53:02 monitoring systemd[6123]: Reached target Main User Target.
May 20 09:53:02 monitoring systemd[6123]: Startup finished in 181ms.
May 20 09:53:02 monitoring systemd[1]: Started User Manager for UID 986.
May 20 09:53:02 monitoring systemd[1]: Started Session 26 of User monitoring.
May 20 09:53:02 monitoring systemd[1]: Started Session 27 of User monitoring.

I am seeing the same error, Failed to acquire watch file descriptor: Permission denied, in my /var/log/messages file after performing a yum update on a Rocky Linux system. On Rocky Linux 9.3 the error did not show up; it started appearing immediately after the upgrade to 9.4. This is consistent across two servers, both of which were upgraded from 9.3 to 9.4.

Not sure where to start looking or how to update permissions for some file descriptor. Any help would be greatly appreciated. :+1:

Hi,
Checkmk uses cron under the hood to execute tasks at regular intervals. When a site is started, Checkmk creates cron jobs for the site user. You can inspect the created cron jobs like this:

OMD[monitoring]:~$ omd status crontab
crontab:        running
-----------------------
Overall state:  running
OMD[monitoring]:~$ crontab -l
#
# Do not edit this file. It will be recreated each time OMD
# is started or reloaded.
#
# execute 'omd reload crontab'
# to rebuild this file out of /omd/sites/monitoring/etc/cron.d/*
#
# --ENVIRONMENT------------------------------------------------
SHELL=/bin/bash
BASH_ENV=/omd/sites/monitoring/.profile
OMD_ROOT=/omd/sites/monitoring
OMD_SITE=monitoring
PATH=/omd/sites/monitoring/local/bin:/omd/sites/monitoring/bin:/usr/local/bin:/bin:/usr/bin:/sbin:/usr/sbin
MAILTO=""
# ------------------------------------------------------------
# /omd/sites/monitoring/etc/cron.d/cmk_bulk_notify
# Needed for bulk notifcations.
# Only execute cmk --notify when the microcore is currently not enabled.
* * * * * [ ! -e /omd/sites/monitoring/etc/check_mk/conf.d/microcore.mk -a -d /omd/sites/monitoring/var/check_mk/notify/bulk ] && cmk --notify send-bulks
# ------------------------------------------------------------
# /omd/sites/monitoring/etc/cron.d/cmk_cleanup_pdf_tmp_files
# Once a day, at 00:15, search for PDF tmp files older than 1 day and delete them
15 0 * * * [ -d "$OMD_ROOT/tmp/check_mk/pdf" ] && find $OMD_ROOT/tmp/check_mk/pdf -maxdepth 1 -name tmp\* -mtime 1 -delete
# ------------------------------------------------------------
# /omd/sites/monitoring/etc/cron.d/cmk_cleanup_piggyback
# Once a day, at 00:10, cleanup outdated piggyback files
10 0 * * * cmk --cleanup-piggyback
# ------------------------------------------------------------
# /omd/sites/monitoring/etc/cron.d/cmk_discovery
# Every 5 minutes autodiscover services marked by discovery check
*/5 * * * * cmk --discover-marked-hosts >/dev/null 2>&1
# ------------------------------------------------------------
# /omd/sites/monitoring/etc/cron.d/cmk_dns_cache
# Once a day, at 00:05, update IP address cache of all Check_MK hosts
5 0 * * * cmk --update-dns-cache
# ------------------------------------------------------------
# /omd/sites/monitoring/etc/cron.d/cmk_inventory
# Every 5 minutes autoinventory hosts marked by inventory check
*/5 * * * * cmk --inventorize-marked-hosts
# ------------------------------------------------------------
# /omd/sites/monitoring/etc/cron.d/cmk_license_email_notification
# Run every day at 00:00
0 0 * * * /omd/sites/monitoring/bin/cmk-license-email-notification
# ------------------------------------------------------------
# /omd/sites/monitoring/etc/cron.d/cmk_multisite
# Run Multisite regular jobs, e.g. scheduled reports
* * * * * . $OMD_ROOT/etc/omd/site.conf ; curl http://localhost:$CONFIG_APACHE_TCP_PORT/monitoring/check_mk/run_cron.py >/dev/null 2>&1
# ------------------------------------------------------------
# /omd/sites/monitoring/etc/cron.d/cmk_update_license_usage
# Run every 30 minutes between 8am and 5pm
# Note: in update_license_usage() the next (random) time point is set between 8am and 4pm
*/30 8-17 * * * /omd/sites/monitoring/bin/cmk-update-license-usage
# ------------------------------------------------------------
# /omd/sites/monitoring/etc/cron.d/diskspace
#
# Run every hour, 5 minutes after full hour
#
5 * * * * diskspace >> $OMD_ROOT/var/log/diskspace.log 2>&1
# ------------------------------------------------------------
# /omd/sites/monitoring/etc/cron.d/logrotate
#
# Daily Logrotate
#
0 0 * * * $OMD_ROOT/bin/logrotate -s $OMD_ROOT/tmp/run/logrotate.state $OMD_ROOT/etc/logrotate.conf >/dev/null 2>&1
# ------------------------------------------------------------
# /omd/sites/monitoring/etc/cron.d/php-sessions
# Once a day, at 00:10, search for PHP sessions that are older than 1 day and delete them
10 0 * * * find $OMD_ROOT/tmp/php/session -name sess_\* -mtime +1 -delete
# ------------------------------------------------------------
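In the listing above, two entries (cmk_bulk_notify and cmk_multisite) use the schedule `* * * * *`, so cron starts jobs for the site user every minute. That would be consistent with the two sessions (26 and 27) opened at 09:53:02 in the journal excerpt from the question, although this is only an observation, not a confirmed cause. A minimal sketch for picking out such entries, where the function name `every_minute_jobs` is made up for illustration:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print crontab entries that are scheduled for every minute.
# Reads `crontab -l` output on stdin; comments, variables and other schedules
# (for example */5) do not match the pattern and are dropped.
every_minute_jobs() {
  # grep exits non-zero when nothing matches; "|| true" keeps the pipeline quiet.
  grep -E '^\* \* \* \* \* ' || true
}

# Usage (as the site user):
#   crontab -l | every_minute_jobs
```

Any line it prints is a job that opens a session once per minute, which is roughly how often the message can be expected in the journal.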

When stopping crontab, the created cron jobs are deleted:

OMD[monitoring]:~$ omd stop crontab
Removing Crontab...OK
OMD[monitoring]:~$ crontab -l
no crontab for monitoring
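To check whether the message really stops once the crontab is removed, one option is to count its occurrences in the journal before and after. The following is a rough sketch; `count_watch_errors` is a hypothetical helper name, and reading the system journal may require root or membership in a suitable group.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count journal lines that contain the error message.
# Reads log text on stdin and prints the number of matching lines.
count_watch_errors() {
  # grep -c prints 0 but exits non-zero when nothing matches; ignore that exit code.
  grep -c 'Failed to acquire watch file descriptor: Permission denied' || true
}

# Usage (not run automatically):
#   journalctl --since "-10min" | count_watch_errors
# Run it once with the crontab active and once a few minutes after
# "omd stop crontab", then compare the two numbers.
```

Afterwards the crontab should be brought back, which I would expect to work with `omd start crontab` (assuming the usual `omd start SERVICE` form).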

Stopping crontab makes the error go away (do not leave your site in this state though, it won’t be fully functional). This does not seem to be specific to Checkmk; it appears to affect cron jobs of non-admin users on RHEL/Rocky 9.4 systems in general. It can be reproduced by manually creating cron jobs. First, let’s create a cron job for an admin user (vagrant in my case):

[vagrant@rocky9vm ~]$ cat /crontab_input 
* * * * * echo "abc"
[vagrant@rocky9vm ~]$ cat /crontab_input | crontab -
[vagrant@rocky9vm ~]$ crontab -l
* * * * * echo "abc"

Doing this leads to the following log messages (no error):

Jan 08 10:56:01 rocky9vm systemd[1]: Started Session 4802 of User vagrant.
Jan 08 10:56:01 rocky9vm CROND[487696]: (vagrant) CMD (echo "abc")
Jan 08 10:56:01 rocky9vm CROND[487694]: (vagrant) CMDOUT (abc)
Jan 08 10:56:01 rocky9vm CROND[487694]: (vagrant) CMDEND (echo "abc")
Jan 08 10:56:01 rocky9vm systemd[1]: session-4802.scope: Deactivated successfully.

Now, let’s remove this cron job and create one for a non-admin user (joerg in my case):

[vagrant@rocky9vm ~]$ crontab -r
[joerg@rocky9vm ~]$ cat /crontab_input | crontab -
[joerg@rocky9vm ~]$ crontab -l
* * * * * echo "abc"

Now, we get these log messages, including the error:

Jan 08 10:59:01 rocky9vm systemd[1]: Created slice User Slice of UID 1001.
Jan 08 10:59:01 rocky9vm systemd[1]: Starting User Runtime Directory /run/user/1001...
Jan 08 10:59:01 rocky9vm systemd[1]: Finished User Runtime Directory /run/user/1001.
Jan 08 10:59:01 rocky9vm systemd[1]: Starting User Manager for UID 1001...
Jan 08 10:59:01 rocky9vm systemd[488519]: pam_unix(systemd-user:session): session opened for user joerg(uid=1001) by joerg(uid=0)
Jan 08 10:59:01 rocky9vm systemd[488519]: Failed to acquire watch file descriptor: Permission denied
Jan 08 10:59:01 rocky9vm systemd[488519]: Queued start job for default target Main User Target.
Jan 08 10:59:01 rocky9vm systemd[488519]: Created slice User Application Slice.
Jan 08 10:59:01 rocky9vm systemd[488519]: Started Mark boot as successful after the user session has run 2 minutes.
Jan 08 10:59:01 rocky9vm systemd[488519]: Started Daily Cleanup of User's Temporary Directories.
Jan 08 10:59:01 rocky9vm systemd[488519]: Reached target Paths.
Jan 08 10:59:01 rocky9vm systemd[488519]: Reached target Timers.
Jan 08 10:59:01 rocky9vm systemd[488519]: Starting D-Bus User Message Bus Socket...
Jan 08 10:59:01 rocky9vm systemd[488519]: Starting Create User's Volatile Files and Directories...
Jan 08 10:59:01 rocky9vm systemd[488519]: Finished Create User's Volatile Files and Directories.
Jan 08 10:59:01 rocky9vm systemd[488519]: Listening on D-Bus User Message Bus Socket.
Jan 08 10:59:01 rocky9vm systemd[488519]: Reached target Sockets.
Jan 08 10:59:01 rocky9vm systemd[488519]: Reached target Basic System.
Jan 08 10:59:01 rocky9vm systemd[488519]: Reached target Main User Target.
Jan 08 10:59:01 rocky9vm systemd[488519]: Startup finished in 110ms.
Jan 08 10:59:01 rocky9vm systemd[1]: Started User Manager for UID 1001.
Jan 08 10:59:01 rocky9vm systemd[1]: Started Session 4804 of User joerg.
Jan 08 10:59:01 rocky9vm CROND[488531]: (joerg) CMD (echo "abc")
Jan 08 10:59:01 rocky9vm CROND[488516]: (joerg) CMDOUT (abc)
Jan 08 10:59:01 rocky9vm CROND[488516]: (joerg) CMDEND (echo "abc")
Jan 08 10:59:01 rocky9vm systemd[1]: session-4804.scope: Deactivated successfully.
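To see which users are affected on a given machine, the journal can be scanned for the error and matched against the preceding "Starting User Manager for UID ..." line. This is only a sketch under the assumption that the journal lines look like the excerpts above; the function name `uids_with_watch_error` is made up, and the matching is a simple "last UID seen" heuristic that could mislabel entries if several user managers start at the same moment.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the UID of each user manager whose startup
# logged the error. Reads journal text on stdin.
uids_with_watch_error() {
  awk '
    # systemd[1] announces the UID right before the user manager starts logging.
    /Starting User Manager for UID [0-9]+/ {
      uid = $NF             # last field, e.g. "1001..."
      sub(/\.+$/, "", uid)  # strip the trailing dots
    }
    # Attribute the error to the most recently announced UID.
    /Failed to acquire watch file descriptor: Permission denied/ && uid != "" {
      print uid
    }
  '
}

# Usage (not run automatically):
#   journalctl --since "-1h" | uids_with_watch_error | sort -u
```

Fed with the two excerpts from this post, it should print nothing for the vagrant run (no user manager was started there) and 1001 for the joerg run.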

To conclude, there is nothing to be done from our side; this seems to be an issue with cron jobs of non-admin users on RHEL/Rocky 9.4 systems.


Thanks very much @joerg.herbel for your detailed answer, much appreciated.