Understanding check_mk-cpu.threads

bjerre · October 26, 2020, 7:41am

Dear forum.

We currently have one Linux host reporting about 11k threads, far more than our other hosts.
Can we rely on this check as a measure of the actual server load?

Here is a table of the hosts with some information about them.
None of the VMs is under any meaningful load (30% at most).
Uptime is between 40 and 200 days.

Why would this one server be so overloaded?

Hostname	OS	vCPU	RAM	Thread Count	Running VMs
Host 02	XCP-ng 8.0	80	1536 GB	11300	27
Host 03	XCP-ng 8.0	80	2048 GB	1450	25
Host 04	XCP-ng 8.0	80	2048 GB	1348	20
Host 05	Citrix Xenserver 7.1 CU2	80	1536 GB	1916	22
Host 06	Citrix Xenserver 7.1 CU2	31	512 GB	1865	28

The description of the check reads:

Monitor the number of processes and threads. If too many processes
and threads are found then the check results in a warning or critical
state. The default levels are set to 2000 and 4000.
Author: Mathias Kettner mk@mathias-kettner.de
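
For illustration, the level logic in that description can be sketched roughly as below. This is not the actual Checkmk code: the function name `thread_state` and the use of `>=` for the comparison are assumptions, and the defaults are simply the 2000 and 4000 quoted in the description.

```python
def thread_state(count, warn=2000, crit=4000):
    """Map a thread count to a state using upper warn/crit levels.

    Hypothetical helper; the >= comparison is an assumption.
    """
    if count >= crit:
        return "CRIT"
    if count >= warn:
        return "WARN"
    return "OK"


# Thread counts taken from the table above.
for host, count in [("Host 02", 11300), ("Host 03", 1450), ("Host 05", 1916)]:
    print(host, count, thread_state(count))
```

With these default levels only Host 02 would leave the OK state, which says something about the number of threads but nothing yet about CPU load.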

andreas-doehler · October 26, 2020, 8:52pm

11k threads with 80 vCPUs is not too many, I think.
I have a small 4-core home server here with some containers running. The result is 1.2k threads, with a load of only 1 and a utilization of 20%.
If I remember correctly, a bigger Oracle server in one of my monitoring installations has over 20k or 30k threads.
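
To see thread count and load side by side on a Linux host, something like the following sketch might help. It assumes the usual `/proc/loadavg` layout (three load averages, then `runnable/total` scheduling entities, then the last PID); the function name `parse_loadavg` is made up for this example.

```python
import os


def parse_loadavg(text):
    """Parse the contents of /proc/loadavg into a small dict."""
    fields = text.split()
    # The fourth field looks like "1/80": runnable entities / total entities.
    runnable, total = fields[3].split("/")
    return {
        "load1": float(fields[0]),
        "load5": float(fields[1]),
        "load15": float(fields[2]),
        "runnable": int(runnable),
        "total": int(total),
    }


# Only runs on systems that actually provide /proc/loadavg.
if __name__ == "__main__" and os.path.exists("/proc/loadavg"):
    with open("/proc/loadavg") as f:
        info = parse_loadavg(f.read())
    print(f"load (1 min): {info['load1']}")
    print(f"runnable now: {info['runnable']} of {info['total']} threads/processes")
```

A large total with a small runnable count and a low load average means most threads are just sleeping, which matches the numbers above.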

brauliom · February 26, 2021, 7:07pm

Hi guys… How can I clear this alarm?
CRIT - Count: 37735 threads (warn/crit at 3000 threads/4000 threads)
Is rebooting the server enough, or are there commands that need to be executed to remove the threads?

Please let me know if you need more information.
Thanks in advance. Best Regards.

openmindz · February 26, 2021, 8:01pm

Hi @brauliom and welcome to the forum,

First of all, the mere fact that an OS has a high number of threads doesn’t necessarily constitute a problem, as Andreas correctly pointed out in his previous reply. It depends on a multitude of factors, e.g., what exactly is running on that host, and whether the host can handle this number of threads well.

So, your very first course of action should be to determine whether the services the host provides are negatively affected while CMK is reporting this high number of threads. If they aren’t, whatever is causing this may be “normal”, and you can safely increase the WARN/CRIT thresholds for this check. If the machine is indeed “down on its knees”, you need to find out which processes are causing this.
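
One way to find the processes holding the most threads on a Linux host is to read the `Threads:` field from each `/proc/<pid>/status` file. The following is only a sketch under that assumption; the helper names `parse_status` and `top_thread_users` are invented for this example and are not part of CMK.

```python
import os


def parse_status(text):
    """Return (name, threads) from the text of a /proc/<pid>/status file."""
    name, threads = "?", 0
    for line in text.splitlines():
        key, _, value = line.partition(":")
        if key == "Name":
            name = value.strip()
        elif key == "Threads":
            threads = int(value.strip())
    return name, threads


def top_thread_users(proc_root="/proc", limit=10):
    """List (threads, pid, name) for the processes with the most threads."""
    rows = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            path = os.path.join(proc_root, entry, "status")
            with open(path, errors="replace") as f:
                name, threads = parse_status(f.read())
        except OSError:
            continue  # process exited while we were scanning
        rows.append((threads, int(entry), name))
    rows.sort(reverse=True)
    return rows[:limit]


# Only runs on systems that provide a /proc filesystem.
if __name__ == "__main__" and os.path.isdir("/proc"):
    for threads, pid, name in top_thread_users():
        print(f"{threads:>7}  {pid:>7}  {name}")
```

If one or two processes account for most of the count, those are the ones to look at before thinking about thresholds or reboots.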

So, in short: simply rebooting the host is most likely not a permanent solution if you cannot find out what is causing this behaviour.

HTH,
Thomas

brauliom · March 1, 2021, 12:16pm

Thank you very much, Thomas. Yes, you’re right… If this number of threads is causing a problem, I will request a maintenance window to reboot the machine.

openmindz · March 1, 2021, 12:40pm

Hi Braulio,

Umm… that is not what I said, but… you’re welcome.

Thomas
