Latency on version 2.x

atulchadha24 · January 13, 2022, 5:36pm

CMK version: 2.0.0p11 (CRE)
OS version: CentOS 7.9
We have set up a cluster with a distributed monitoring central site, which connects to about 14 other sites.

The total host count is 2000 with 86K services, and we are seeing unusual latency when searching for or opening any host or service.

Are there any recommendations for reducing the latency and tuning performance in such a setup?

andreas-doehler · January 13, 2022, 7:01pm

It is possible that the problem mentioned here is also your problem.

The fix in the thread is very simple to implement, I think, and it only needs to be applied on the master.

atulchadha24 · January 14, 2022, 8:28am

Trying it now! Thank you

atulchadha24 · January 14, 2022, 9:45am

@andreas-doehler I tried fiddling around with TLS encryption but didn't see much of a difference. I have captured the pages that take a long time to load:

/check_mk/view.py?_show_filter_form=0&filled_in=filter&host=XXXXXXX&view_name=host
/check_mk/sidebar_snapin.py?names=tactical_overview&since=1642149172&_ajaxid=1642152170
/check_mk/view.py?filled_in=filter&host=XXXX&view_name=host&_display_options=htbfcoderuw&_do_actions=&_ajaxid=1642152337

They seem to have up to 20 seconds of “waiting” time. Any idea how this can be reduced? The hardware is a VM with 24 CPUs and 32 GB of RAM.

Also, what do you recommend for swapping?

andreas-doehler · January 14, 2022, 11:28am

If it is waiting, this means the remote site that holds your host data is not responding quickly.
You can test if this happens for hosts from all your sites.
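One way to compare the sites is to time the same small Livestatus query against each of them. The following is only a minimal sketch: the site names, hostnames, and port 6557 in the commented usage are placeholders, and it assumes Livestatus is reachable over plain, unencrypted TCP. If TLS encryption is enabled for Livestatus, the socket would need to be wrapped with `ssl` first.

```python
import socket
import time


def build_query(table="hosts", columns=("name",), limit=1):
    """Build a minimal Livestatus GET query (terminated by a blank line)."""
    lines = [f"GET {table}", "Columns: " + " ".join(columns), f"Limit: {limit}"]
    return "\n".join(lines) + "\n\n"


def timed(func, *args, **kwargs):
    """Run func and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


def query_site(address, query, timeout=30.0):
    """Send one query over plain TCP and read the reply until the peer closes."""
    with socket.create_connection(address, timeout=timeout) as sock:
        sock.sendall(query.encode("utf-8"))
        sock.shutdown(socket.SHUT_WR)  # signal that the query is complete
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def measure_sites(sites, query):
    """Return {site_name: seconds, or None if the site could not be reached}."""
    results = {}
    for name, address in sites.items():
        try:
            _, elapsed = timed(query_site, address, query)
            results[name] = elapsed
        except OSError:  # includes timeouts and refused connections
            results[name] = None
    return results


# Example usage (hypothetical sites, adjust to your environment):
# sites = {"central": ("127.0.0.1", 6557), "remote1": ("remote1.example.com", 6557)}
# for site, seconds in measure_sites(sites, build_query()).items():
#     print(site, "unreachable" if seconds is None else f"{seconds:.3f}s")
```

If one or two sites stand out with much longer times than the rest, those are the ones to look at first.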

Swapping is bad on all server systems.

atulchadha24 · January 14, 2022, 12:24pm

Good that we think the same about swapping. I have tested it from the local site too, and the results seem to be more or less the same.

Will check on other sites and get back to you.

atulchadha24 · January 14, 2022, 1:38pm

@andreas-doehler This seems to be common to multiple sites. I also see high disk usage from the rrdcached process.

This site was created from a 1.4 backup and upgraded to version 2.x.

rrdcached seems to write to disk asynchronously. We are using DRBD to provide HA for the site, which may also be contributing to some latency.

Do you recommend changing the rrdcached parameters from the settings below?

andreas-doehler · January 14, 2022, 7:33pm

That’s normal. The name already says what it does: it caches the writes for RRD data and writes out all data after a specific time or when requested.
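As a rough illustration of that caching idea, here is a toy write-behind cache in Python. It is not how rrdcached is implemented; the class name, the `write_func` callback, and the 300-second default timeout are all made up for the example. It only shows why disk writes arrive in bursts rather than one per update.

```python
import time


class WriteBehindCache:
    """Toy write-behind cache: buffer updates per file, write them out later."""

    def __init__(self, write_func, timeout=300.0, clock=time.monotonic):
        self.write_func = write_func  # called as write_func(filename, values)
        self.timeout = timeout        # seconds an update may wait in memory
        self.clock = clock            # injectable clock, handy for testing
        self.pending = {}             # filename -> list of buffered values
        self.first_seen = {}          # filename -> time of oldest buffered value

    def update(self, filename, value):
        """Buffer a value instead of writing it to disk immediately."""
        if filename not in self.pending:
            self.pending[filename] = []
            self.first_seen[filename] = self.clock()
        self.pending[filename].append(value)

    def flush(self, filename):
        """Write out everything buffered for one file (the 'when requested' case)."""
        values = self.pending.pop(filename, None)
        self.first_seen.pop(filename, None)
        if values:
            self.write_func(filename, values)

    def flush_expired(self):
        """Write out files whose oldest buffered value has reached the timeout."""
        now = self.clock()
        expired = [f for f, t in self.first_seen.items() if now - t >= self.timeout]
        for filename in expired:
            self.flush(filename)
```

A caller would invoke `update()` for every new measurement and call `flush_expired()` periodically, so many small updates end up as one batched write per file.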

This behavior should not impact your slow response problem.

If you access the web interface on a single slave site, there should be no latency problem.
