Count, size and age of files - delays windows agent

Hey,

we are suffering from long execution times on Windows hosts when we use “Count, size and age of files”. At the moment we are monitoring about 200 files (PG backups). As soon as we deploy the file check, the execution time explodes from 2 sec. to about 50 sec.

But the problem does not occur on every Windows server. The number of files is nearly the same:

HostA: 190 files – execution time 1.7 sec

HostB: 207 files - execution time 48.8 sec

They both use the same CMK rule to define what files should be watched. If we disable the file check we are back to 2 sec. execution time.

CMK version: Checkmk Enterprise Edition 2.4.0p21

OS version: SUSE Linux Enterprise Server 15 SP7

Josef

Hi Josef,

Issue: The “Count, size and age of files” rule (mk_filestats / fileinfo) causes massive slowdowns on some Windows hosts — agent runtime jumps from ~2 seconds to ~48–50 seconds, while the same rule runs fine on other hosts with similar file counts.

This is a well-known performance problem with the Windows agent when monitoring files.

Main causes:

  • The plugin uses globbing to scan directories. Even with only ~200 matching files, Windows can be very slow if the folder contains thousands of files, is on HDD, heavily fragmented, or scanned by antivirus in real-time.
  • Too broad patterns (e.g. C:\Backups\** or recursive search) make it dramatically worse.
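
To check whether the first cause applies, it may help to compare how many entries a folder really holds against how many match the pattern, on both the fast and the slow host. The following is only a minimal sketch: scan_ratio is a hypothetical helper, not part of Checkmk, and it looks at a single directory without recursion.

```python
import fnmatch
import os


def scan_ratio(directory, pattern):
    """Return (total_entries, matching_entries) for one directory, non-recursive."""
    total = 0
    matching = 0
    # os.scandir walks every entry once, which is the work a glob has to do
    # before it can decide which names match.
    with os.scandir(directory) as entries:
        for entry in entries:
            total += 1
            if fnmatch.fnmatch(entry.name, pattern):
                matching += 1
    return total, matching


# Example call (hypothetical path and pattern, adjust to your rule):
# total, matching = scan_ratio(r"D:\example", "*.backup")
# print(f"{matching} of {total} directory entries match")
```

If the slow host reports a far larger total than the fast one for a similar number of matches, the folder size rather than the number of monitored files is the likely difference.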

Quick recommendations:

  1. Make your include patterns as narrow and specific as possible (e.g. C:\PG_Backups\*.backup instead of broad wildcards).
  2. Test the section manually on the slow host:
"C:\Program Files (x86)\checkmk\agent\check_mk_agent.exe" fileinfo
  3. Consider running the rule less frequently or switch to a custom local check (PowerShell) for better performance.
  4. On servers with many files, use aggregation options like count_only where possible.
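
A custom local check along these lines could look roughly like the sketch below. It is written in Python instead of PowerShell, which assumes a Python interpreter is available to the agent on the host. The function name, the service name, the metric names and the choice of CRIT when nothing matches are all assumptions for illustration. The only thing it relies on is the local check line format (status, quoted service name, metrics separated by "|", then free text).

```python
import glob
import os
import stat
import time


def build_local_check_line(pattern, service_name="PG backups"):
    """Build one line in Checkmk local check format for files matching pattern."""
    count = 0
    total_size = 0
    newest_mtime = None
    for path in glob.iglob(pattern):
        try:
            st = os.stat(path)
        except OSError:
            continue  # file vanished or is not accessible, skip it
        if not stat.S_ISREG(st.st_mode):
            continue  # ignore directories and other non-regular entries
        count += 1
        total_size += st.st_size
        if newest_mtime is None or st.st_mtime > newest_mtime:
            newest_mtime = st.st_mtime
    if count == 0:
        # Assumption: no matching file at all is treated as CRIT (status 2).
        return f'2 "{service_name}" count=0 no files match {pattern}'
    newest_age = max(0, int(time.time() - newest_mtime))
    metrics = f"count={count}|size={total_size}|newest_age={newest_age}"
    return f'0 "{service_name}" {metrics} {count} files, newest is {newest_age} s old'


# Example call (hypothetical path, adjust to your backup folder):
# print(build_local_check_line(r"D:\example\*.backup"))
```

This produces a single service with three metrics instead of one entry per file, so it is closer to count_only than to a full file listing.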

Can you share the exact include/exclude patterns you are using in the rule? That usually helps pinpoint the bottleneck.

By the way, option 3 (running the rule less frequently or switching to a custom local check in PowerShell) is my preference.

Greets Bernd

Hey,

thanks for the detailed reply :slight_smile:

What I don’t understand here is that I have other Windows hosts with nearly the same number of files, and they are working great. Same folders, same patterns.

Josef

Yeah, it happens very often that the behavior differs between hosts. Why, I don't know. It's all just software with a life of its own :wink:

Greetz Bernd

I hope my earlier reply helped.

The output type you use can have a significant impact on runtime: file_stats lists all matching files, whereas extremes_only and count_only output only metrics.

It would also be interesting to determine whether the slowdown occurs during the file list generation (retrieving all filenames matching the pattern) or when reading the file statistics.
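
One rough way to separate the two phases is to time them independently on the slow host, outside the agent. The sketch below is only an approximation under stated assumptions: time_phases is a hypothetical helper, and it uses Python's glob and os.stat, which need not match what the agent plugin does internally.

```python
import glob
import os
import time


def time_phases(pattern, recursive=False):
    """Time filename listing and stat calls separately for one glob pattern."""
    # Phase 1: build the list of matching filenames.
    start = time.perf_counter()
    paths = glob.glob(pattern, recursive=recursive)
    listing_seconds = time.perf_counter() - start

    # Phase 2: read the file statistics for every match.
    start = time.perf_counter()
    stat_ok = 0
    for path in paths:
        try:
            os.stat(path)
            stat_ok += 1
        except OSError:
            pass  # file vanished or is not accessible
    stat_seconds = time.perf_counter() - start

    return {
        "files": len(paths),
        "stat_ok": stat_ok,
        "listing_seconds": listing_seconds,
        "stat_seconds": stat_seconds,
    }


# Example call (hypothetical path, adjust to the pattern from your rule):
# print(time_phases(r"D:\example\*.backup"))
```

A second run is often faster because the operating system caches directory and file metadata, so the first run after a pause is usually the more telling one.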