APT Updates: Make check more lenient

I am using the mk_apt plugin to monitor available APT security updates on my servers, which works great so far. However, I get a lot of false-positive notifications. The reason is that the updates get auto-installed once every 24 hours; if the mk_apt check runs when new updates are available but before they are installed (a window of a couple of hours), I get a notification that is automatically cleared some time later.
Is it possible to make this check go critical only if, e.g., it detects the same state twice in a row?

Note that I can’t just increase the interval between update checks to e.g. two days. The mk_apt check might still run at an inconvenient time, and it would then stay on “critical” for two days.

Basically, what I actually want is to detect if the automatic installation of APT security updates fails or does not take place for whatever reason.


You can run touch -d 1970-01-01 /var/lib/check_mk_agent/cache/plugins_mk_apt.cache right after your automatic updates and the agent will update the info immediately.
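A minimal sketch of how this could be wrapped for use in an update script, assuming GNU coreutils (`touch -d`, `stat -c`). The function name `expire_cache` and the temporary demo paths are made up for illustration; only the `touch -d 1970-01-01` call comes from the suggestion above.

```shell
#!/bin/sh
# Sketch: back-date a cache file so it counts as expired on the next agent run.
# expire_cache is a hypothetical helper name, not part of the Checkmk agent.
expire_cache() {
    cache_file="$1"
    # Only touch files that already exist, so no empty cache file is created.
    if [ -e "$cache_file" ]; then
        touch -d 1970-01-01 "$cache_file"
    fi
}

# Demo on a temporary directory instead of /var/lib/check_mk_agent/cache.
demo_dir="$(mktemp -d)"
demo_cache="$demo_dir/plugins_mk_apt.cache"
echo "cached output" > "$demo_cache"

expire_cache "$demo_cache"               # existing file: mtime is reset
expire_cache "$demo_dir/missing.cache"   # missing file: nothing happens
```

In a real update job you would probably call `expire_cache /var/lib/check_mk_agent/cache/plugins_mk_apt.cache` after the update command regardless of its exit status, so that a failed update run also shows up promptly.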


Thank you! I will run this command right after my cronjob that executes the update command.

FWIW you can also use an apt hook as described here.


As I update my systems via Ansible, this info was very valuable: it prompted me to implement tasks for handling this.

For pushing the APT solution (2 tasks):

    - name: Check if post-APT-update conf file exists.
      ansible.builtin.stat:
        path: /etc/apt/apt.conf.d/98mk-apt
      register: cmk_apt_stat
    
    - name: Create /etc/apt/apt.conf.d/98mk-apt if it does not exist
      ansible.builtin.copy:
        dest: /etc/apt/apt.conf.d/98mk-apt
        content: |
          # retrigger mk-apt (check_mk)
          DPkg::Post-Invoke {"touch -t 197012181200 /var/lib/check_mk_agent/cache/*mk_apt.cache"; };
      when: not cmk_apt_stat.stat.exists

For RPM-based machines (using the yum plugin):

    - name: Touch cachefile of YUM plugin to not wait for the interval
      ansible.builtin.file:
        path: /var/lib/check_mk_agent/cache/plugins_yum.cache
        modification_time: "197012181200.00"

For SUSE-based machines (using the zypper plugin):

    - name: Touch cachefile of ZYPPER plugin to not wait for the interval
      ansible.builtin.file:
        path: /var/lib/check_mk_agent/cache/plugins_mk_zypper.cache
        modification_time: "197012181200.00"
  • Glowsome

The easier solution is just to remove the cache file.

Can you give more insight into this?

I have (manually) removed cache files for plugins in the past, which resulted in weird behavior such as the plugin going stale or dropping off the chart altogether, so I am a bit hesitant to hard-delete cache files.
Also, if something goes wrong there is no fallback, because the file was deleted (destructive), whereas changing the timestamp is non-destructive.

With the suggestion above by @Jay2k1 I have not yet encountered this.
So from my standpoint this is, IMHO, the preferred method of overriding a potentially cached result.

  • Glowsome

I’ve just deleted all cache files, and on the next run the Check_MK service went WARN.
On the run after that, a cached local check went UNKN due to “item not found in monitoring data”, so I think changing the timestamp is the cleanest solution.

However for me personally, it wouldn’t matter much because I have Maximum number of check attempts set to 3 (and some even higher), so these one-time failures would just cause a soft state.

Just an update on how I experienced the APT implementation as posted above:

    - name: Check if post-APT-update conf file exists.
      ansible.builtin.stat:
        path: /etc/apt/apt.conf.d/98mk-apt
      register: cmk_apt_stat

    - name: Create /etc/apt/apt.conf.d/98mk-apt if it does not exist
      ansible.builtin.copy:
        dest: /etc/apt/apt.conf.d/98mk-apt
        content: |
          # retrigger mk-apt (check_mk)
          DPkg::Post-Invoke {"touch -t 197012181200 /var/lib/check_mk_agent/cache/*mk_apt.cache"; };
      when: not cmk_apt_stat.stat.exists

After having watched the timestamps get re-aligned, it felt like shooting at a mosquito with a shotgun, due to the wildcard in /var/lib/check_mk_agent/cache/*mk_apt.cache.

Without hard-setting an addition on the host, I revised things (as I do not like having hardcoded stuff for monitoring on my hosts):

  • removed the previously set /etc/apt/apt.conf.d/98mk-apt
  • created a new Ansible-task to only reset the timestamp on plugins_mk_apt.cache:

    - name: Touch cachefile of APT plugin to not wait for the interval
      ansible.builtin.file:
        path: /var/lib/check_mk_agent/cache/plugins_mk_apt.cache
        modification_time: "197012181200.00"

After testing this, the same behavior was seen: the information is updated within a maximum of 2 check cycles (default 1 minute each), without issues.
This was regardless of my having the actual check in a 3600 subdirectory so that it only updates once an hour.
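To illustrate why back-dating the file works even with a 3600-second interval, here is a simplified model of an age-based cache check. It is a sketch under assumptions, not the agent's actual code: the function name `cache_is_fresh` is hypothetical, and GNU `stat -c %Y` is assumed.

```shell
#!/bin/sh
# Simplified model of an age-based cache check (hypothetical helper, not agent code).
# Usage: cache_is_fresh FILE MAX_AGE_SECONDS
# Succeeds (exit status 0) if FILE exists and is at most MAX_AGE_SECONDS old.
cache_is_fresh() {
    file="$1"
    max_age="$2"
    [ -e "$file" ] || return 1
    now="$(date +%s)"
    mtime="$(stat -c %Y "$file")"   # modification time in epoch seconds (GNU stat)
    [ $((now - mtime)) -le "$max_age" ]
}

demo_file="$(mktemp)"

# A file that was just created is fresh for a one-hour interval.
if cache_is_fresh "$demo_file" 3600; then fresh_before=yes; else fresh_before=no; fi

# Back-date it the same way the Ansible task does; it now counts as expired.
touch -t 197012181200 "$demo_file"
if cache_is_fresh "$demo_file" 3600; then fresh_after=yes; else fresh_after=no; fi

echo "before: $fresh_before, after: $fresh_after"
rm -f "$demo_file"
```

Because the file's age after the touch is far greater than any realistic interval, the cached result is treated as expired no matter which interval subdirectory the plugin lives in.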

  • Glowsome
