Slowness in Multisite

Dear All,

We are running check_mk version 1.5.0p21 with multiple check_mk slave sites configured.

We have configured status hosts to tackle the slowness.

Even so, we are still facing slowness at the Multisite level when one or two slaves are not responding.

Please let me know if there is any solution to this slowness issue.


Raw or Enterprise edition?

With CEE you should use liveproxyd.

We are using the Raw edition.

Is the status host down when the livestatus connection to the remote site is not possible?

We are facing the problem when the host is UP but port 6557 on the slave server is not reachable.

This situation cannot be remedied.

I think it is possible to resolve this problem :slight_smile:
Create your status host on the main site and don’t use ping as the host check command.
Create an active check that tests whether port 6557 is reachable on the slave site. Then use the result of this check as your host check command.
If you want to extend this a little further, you can use an active check that really tests whether livestatus answers your query.
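
To illustrate that last idea, here is a minimal Python sketch of such a check, not an official plugin. It sends a tiny query for the `status` table to the Livestatus TCP port and only reports OK if a non-empty answer comes back within the timeout. The function names `query_livestatus` and `check` and the timeout values are my own assumptions, and it presumes Livestatus is reachable unencrypted on port 6557.

```python
import socket
import time

# Nagios-style exit codes used by active checks.
OK, CRITICAL = 0, 2


def query_livestatus(host, port=6557, timeout=5.0):
    """Send a small Livestatus query and return (response_text, seconds).

    The timeout applies to each socket operation, not to the whole query.
    """
    # A Livestatus query is terminated by an empty line.
    query = "GET status\nColumns: program_version\n\n"
    started = time.monotonic()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(query.encode("utf-8"))
        # Close our sending side; without KeepAlive the server
        # closes the connection after it has replied.
        sock.shutdown(socket.SHUT_WR)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    answer = b"".join(chunks).decode("utf-8", "replace").strip()
    return answer, time.monotonic() - started


def check(host, port=6557, timeout=5.0):
    """Return (exit_code, message) in the usual plugin style."""
    try:
        answer, elapsed = query_livestatus(host, port, timeout)
    except OSError as exc:  # covers timeouts and refused connections
        return CRITICAL, "CRIT - Livestatus not answering: %s" % exc
    if not answer:
        return CRITICAL, "CRIT - Livestatus returned an empty response"
    return OK, "OK - Livestatus answered in %.2fs" % elapsed
```

Wrapped in a small script that prints the message and exits with the returned code, something like this could be run as a custom active check against each slave and then be used as the status host's check result.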

I have set the host check command to “telnet 6557” for the slave servers, but the UI is not responding after this change.

No telnet.
Do it the following way.

  • create a normal status host with ping and one active check - “Check connecting to a TCP port” with port 6557
  • test if this is working
  • now change the host check command from ping to “Use the status of the service …” and enter the name of your TCP port check there, exactly as it appears in your GUI
  • now check whether the host shows down when livestatus on the slave is stopped
  • if this is working you can use this status host inside your distributed configuration
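
For reference, the TCP port check from the first step boils down to a connect attempt with a timeout. A rough Python sketch of that idea follows; it is not the actual plugin behind “Check connecting to a TCP port”, and the function name `tcp_port_state` and the 3 second timeout are just assumptions for illustration.

```python
import socket
import time


def tcp_port_state(host, port=6557, timeout=3.0):
    """Return (is_open, seconds, detail) for a single TCP connect attempt."""
    started = time.monotonic()
    try:
        # create_connection raises OSError on refusal, timeout or DNS failure.
        with socket.create_connection((host, port), timeout=timeout):
            pass
    except OSError as exc:
        return False, time.monotonic() - started, str(exc)
    return True, time.monotonic() - started, "connected"
```

Keep in mind that a successful connect only shows that the port accepts connections; it does not show that livestatus answers queries.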

Yes, I have done it in a similar way. Instead of an active service check, I set the host check command to a TCP connect to port 6557 of the slave servers.

Anyway, I will try your steps as well and update you.

Thank you

I have configured “TCP Port connect to 6557” as an active check for all slaves.
In the site configuration, the status host of each slave now uses the respective active check as its host check command.

The response of the multisite master is still very slow (but yes, it is better than before).

I have observed that some slaves are UP and reachable via telnet on port 6557, but their livestatus queries take a long time to respond. Check_mk then waits for the timeout (30 seconds) configured for the respective slave in the site configuration.

Is there any way to improve the livestatus query response time?

