At the moment we’re having issues with exactly one Windows server & the agent running on it. It is running, accepts the connection, but doesn’t return any data.
The agent’s check_mk.log file contains the following error at the end:
2022-07-22 14:02:46.615 [srv 9128] [ERROR:CRITICAL] Memory usage is too high [233684992]
The agent did work at one point but hasn’t for several days now.
My Google-fu is extremely weak with regard to this particular error, which I’ve never seen before either.
Any ideas?
CMK version: 2.1.0p6.cee
OS version: Windows Server 2019
Or at least restart the check_mk agent service? Maybe you have an unnoticed memory leak in it and restarting the service may do the trick. If all else fails, it’s Windows; reboot it.
Hmmm, now I’m no Windows expert, but I found this ‘showmemusage.bat’ file somewhere
@echo off
REM Name: showmemusage.bat
REM Created: November 25, 2014
REM Last Modified: November 25, 2014
REM Version: 1.0
REM
REM Description: calculates the total amount of memory used by all instances
REM of a specified process name. The name can be placed on the command line,
REM e.g., "showmemusage chrome.exe". If none is included on the command line,
REM i.e., the batch file is run with "showmemusage", it will prompt for a
REM process name. The total memory usage derived from the tasklist command
REM will be displayed in kilobytes (KB). See
REM http://support.moonpoint.com/os/windows/commands/batch/showmemusage.php
setlocal EnableDelayedExpansion
set total=0
REM If no process name was entered on the command line, prompt for the process
REM name.
IF [%1]==[] (
set /p pname="Process name: "
) ELSE (
set pname=%1
)
for /f "tokens=5" %%i in ('tasklist /fi "imagename eq %pname%" ^| findstr " K$"') do (
set pmemuse=%%i
REM eliminate the comma from the number
set pmemuse=!pmemuse:,=!
set /a total=!total! + !pmemuse!
)
echo %total% K
Maybe it can help you determine the memory use of the check_mk_agent according to Windows. See if the agent is actually using too much memory.
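If Python happens to be available on the box, the same idea can be sketched without depending on the column layout or the thousands separator of the local Windows language. This is only a sketch: it assumes `tasklist /fo csv /nh` prints the memory figure as the fifth CSV field (e.g. "17,432 K"), and the function names are made up for illustration.

```python
import csv
import io
import re
import subprocess


def total_memory_kb(tasklist_csv: str) -> int:
    """Sum the memory column of `tasklist /fo csv /nh` output, in KB."""
    total = 0
    for row in csv.reader(io.StringIO(tasklist_csv)):
        # Assumed columns: image name, PID, session name, session#, mem usage.
        if len(row) < 5:
            continue  # e.g. the "INFO: No tasks are running ..." message
        # Keep digits only, so "17,432 K" and "17.432 K" both become 17432.
        digits = re.sub(r"\D", "", row[4])
        if digits:
            total += int(digits)
    return total


def query_process_memory_kb(image_name: str) -> int:
    """Run tasklist for one image name (Windows only) and total its memory."""
    result = subprocess.run(
        ["tasklist", "/fi", f"imagename eq {image_name}", "/fo", "csv", "/nh"],
        capture_output=True,
        text=True,
        check=True,
    )
    return total_memory_kb(result.stdout)
```

On the affected server, `print(query_process_memory_kb("check_mk_agent.exe"), "K")` should then give a figure comparable to what Task Manager shows.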
Or perhaps, if possible, try another version of the agent. You mentioned that it worked before. Has something recently changed on the server that may cause this issue?
It’s easy to see in Task Manager itself that the check_mk_agent.exe process balloons to > 260 MB almost instantly as soon as the check starts. So yeah, the memory usage is very real.
On different servers memory usage of check_mk_agent.exe stays almost constant around 17 to 18 MB.
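For comparison with the log line: the number in brackets carries no unit, but assuming it is a byte count, it lands in the same range as the Task Manager figure. A quick conversion, with the unit being my assumption:

```python
def bytes_to_mib(n_bytes: int) -> float:
    """Convert a byte count to mebibytes (1 MiB = 1024 * 1024 bytes)."""
    return n_bytes / (1024 * 1024)


# Value in brackets from check_mk.log; the unit (bytes) is an assumption.
logged_value = 233684992
print(f"{bytes_to_mib(logged_value):.1f} MiB")  # prints "222.9 MiB"
```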
Question is, where do those spikes come from? Sounds to me like the agent’s reading a huge amount of data into memory, data that’s only present on this one particular server but not on others. The agent is the same one as the other Windows servers use; no special rules apply to this one particular problematic one.
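One way to test that hypothesis, if a dump of the agent output can be captured on the host at all, would be to measure how many bytes each section contributes. The sketch below relies only on the usual `<<<section_name>>>` header lines of Checkmk agent output. The function name is hypothetical, and piggyback headers (`<<<<host>>>>`) are not handled specially.

```python
import re
from collections import Counter

# Matches headers such as <<<mem>>> or <<<name:sep(9)>>> and captures the name.
SECTION_HEADER = re.compile(r"^<<<([^:>]+)[^>]*>>>\s*$")


def section_sizes(agent_output: str) -> Counter:
    """Return the approximate number of bytes of data per agent section."""
    sizes = Counter()
    current = None
    for line in agent_output.splitlines():
        match = SECTION_HEADER.match(line)
        if match:
            current = match.group(1)
            continue
        if current is not None:
            # +1 accounts for the line break removed by splitlines().
            sizes[current] += len(line.encode("utf-8")) + 1
    return sizes
```

Calling `section_sizes(dump_text).most_common(5)` on dumps from a healthy host and from the print server would show whether one section is far larger on the latter.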
Then the agent config of this machine might be relevant to solve the problem.
Is it possible that other plugins are used on this machine than on the other ones?
The same agent is used by 184 different Windows hosts. We do not deploy any custom CheckMK configuration to those hosts, we do not modify any of the CheckMK agent’s configuration files on the host manually. All but the affected server work just fine.
Of course all of those hosts are very different and do different things: Windows server versions all over the place, some are AD DCs, others are application servers hosting MS SQL, others run web stuff via IIS.
The problematic server in question is used mainly as a print server. Standard Windows Server 2019 print server setup in a standard Windows AD domain.