Hi,
thanks for supporting us in this matter. In my specific test case, your proposed change solves the issue of a slow web interface when using TLS-encrypted connections to other sites.
In my naive understanding of a timeout value, changing it should make no difference unless the timeout is actually reached. A timeout of 1 s seems reasonable in a WAN environment, but it should never trigger in our test scenario: two checkmk instances running on a single piece of hardware in the same virtual network, with an actual network latency of < 1 ms. The VMs are isolated, and we have no network latency issues or connection problems. If this hypothesis holds, changing the timeout value does not address the underlying issue, and the timeout is not its cause either.
I happened to come across the "Analyze configuration" feature of checkmk and saw that all sites (even the local site) are marked as critical in terms of Livestatus usage:
CRIT: The current livestatus usage is 100.00% (!!), 20 of 20 connections used (!!), You have a connection overflow rate of 0.00/s (!!)
I restarted all sites several times, and because the environment is isolated, I am the only user of the system.
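To cross-check these numbers outside the GUI, the same counters could presumably be read directly from the Livestatus `status` table. The following is only a minimal sketch under assumptions: the column names are the ones I believe this check is based on and may differ between versions, the helper names are made up for illustration, and the socket path in the usage note is the usual default inside a site, not something I have verified for every setup.

```python
import socket

# Assumed column names of the Livestatus "status" table; adjust them if the
# installed version rejects the query.
COLUMNS = [
    "livestatus_usage",
    "livestatus_threads",
    "livestatus_active_connections",
    "livestatus_overflows_rate",
]


def build_query(columns):
    """Build a Livestatus query; the trailing blank line ends the request."""
    return "GET status\nColumns: " + " ".join(columns) + "\n\n"


def parse_row(line, columns):
    """Parse one semicolon-separated answer line into a dict of floats."""
    values = line.strip().split(";")
    if len(values) != len(columns):
        raise ValueError("expected %d fields, got %d" % (len(columns), len(values)))
    return dict(zip(columns, (float(v) for v in values)))


def usage_percent(row):
    """Share of Livestatus threads that are currently serving a connection."""
    threads = row["livestatus_threads"]
    if threads == 0:
        return 0.0
    return 100.0 * row["livestatus_active_connections"] / threads


def query_livestatus(socket_path, query):
    """Send a query to the Livestatus UNIX socket and return the raw answer."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(socket_path)
        sock.sendall(query.encode("utf-8"))
        sock.shutdown(socket.SHUT_WR)  # signal that the request is complete
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8")
```

Run as the site user, something like `usage_percent(parse_row(query_livestatus("tmp/run/live", build_query(COLUMNS)), COLUMNS))` should then reproduce the percentage shown above, assuming the socket lives at `tmp/run/live` below the site directory. Note that the query itself occupies one connection while it runs.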
Several questions arose:
- Is this number of used connections the intended behavior? If so, why is this check red?
- Is this symptom related to this issue?
- Why are there so many used connections?
- What are the connections used for?
- What is the actual cause for so many open(?) connections?
- How may I decrease the number of open connections?
Best,
nh