Hi list
After updating my Ubuntu VMs today to the newest update, I encounter massive livestatus timeout errors.
At first I thought this might have been caused by the Checkmk update from p16 to p17, but it seems not.
I run a distributed setup of 4 VMs, all on Ubuntu 18.04 LTS. Since updating today to kernel 4.15.0-115-generic #116-Ubuntu SMP Wed Aug 26 14:04:49 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
(that's what uname -a says), the two biggest slaves run into a lot of livestatus timeouts. The livestatus.log on the master shows lots of
[cmk.liveproxyd.(14909).Site(<slavename>).Client(17)] Cannot send error message to client: [Errno 9] Bad file descriptor
and often also
Livestatus error: ('_ssl.c:711: The handshake operation timed out',).
or the longer variant
Livestatus error: ('_ssl.c:711: The handshake operation timed out',). The encryption settings are probably wrong.
Also, when I log on to the slaves directly, I get empty dashboards with
Cannot connect to 'unix:/omd/sites/INFMON01_2/tmp/run/live': [Errno 11] Resource temporarily unavailable
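For anyone who wants to check whether the socket itself still answers, independent of the GUI and liveproxyd, here is a minimal sketch in Python 3. It is not taken from Checkmk; the function name query_livestatus, the default query and the 5 second timeout are my own assumptions. It only relies on the plain Livestatus protocol: connect to the UNIX socket, send a query terminated by a blank line, and read until the server closes the connection.

```python
import socket


def query_livestatus(socket_path, query="GET status\nColumns: program_version\n", timeout=5.0):
    """Send one Livestatus query to a UNIX socket and return the raw reply.

    Hypothetical helper for illustration; raises socket.timeout or OSError
    if the socket does not answer or cannot be reached.
    """
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.settimeout(timeout)  # applies to connect, send and each recv
    try:
        sock.connect(socket_path)
        # The extra newline produces the blank line that ends the query.
        sock.sendall(query.encode("utf-8") + b"\n")
        # Signal that nothing more will be sent on this connection.
        sock.shutdown(socket.SHUT_WR)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closed the connection, reply is complete
                break
            chunks.append(data)
        return b"".join(chunks).decode("utf-8", errors="replace")
    finally:
        sock.close()
```

Run as the site user on the slave, a call such as query_livestatus("/omd/sites/INFMON01_2/tmp/run/live") should either return a short reply or raise the same kind of error the dashboard reports, which helps to tell a local core problem from a proxy or network problem.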
After fiddling around with downgrading to p16, updating to p17 again and so on, I ended up disabling TLS encryption for the slaves' connections, changing the livestatus proxy settings a little, and enabling the setting "Use persistent connection" in the slaves' connection setup.
Still, I get a lot of these on the master:
2020-09-02 10:24:05,932 [40] [cmk.liveproxyd.(544).Site(<sitename>).Client(22)] Cannot send error message to client: [Errno 32] Broken pipe
2020-09-02 10:24:37,675 [40] [cmk.liveproxyd.(544).Site(<sitename>).Client(27)] Cannot send error message to client: [Errno 32] Broken pipe
2020-09-02 10:24:37,786 [40] [cmk.liveproxyd.(544).Site(<sitename>).Thread(Thread-12).Channel(12)] Channel failed
Traceback (most recent call last):
  File "/omd/sites/INFMON01/lib/python/cmk/cee/liveproxy/Channel.py", line 174, in _execute
    answer = self._get_livestatus_response()
  File "/omd/sites/INFMON01/lib/python/cmk/cee/liveproxy/Channel.py", line 337, in _get_livestatus_response
    header = self._receive_data(16, self._site.query_timeout())
  File "/omd/sites/INFMON01/lib/python/cmk/cee/liveproxy/Channel.py", line 385, in _receive_data
    raise Exception("Remote Site Query timeout")
Exception: Remote Site Query timeout
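To see which sites and which errors dominate, the log entries above can be tallied with a few lines of Python 3. This is only a rough sketch: the regular expression is derived from the entries quoted in this post, assumes one entry per line as in the log file itself, and the names LINE_RE and tally are made up for illustration. Traceback lines do not match the pattern and are skipped.

```python
import re
from collections import Counter

# Pattern derived from the quoted entries:
# "<date> <time>,<ms> [<level>] [cmk.liveproxyd.(<pid>).Site(<site>)...] <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"\[(?P<level>\d+)\] "
    r"\[cmk\.liveproxyd\.\(\d+\)\.Site\((?P<site>[^)]*)\)[^\]]*\] "
    r"(?P<msg>.*)$"
)


def tally(lines):
    """Count log entries per (site, message); non-matching lines are ignored."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            counts[(match.group("site"), match.group("msg"))] += 1
    return counts
```

Feeding it the open log file, for example tally(open(path_to_log)) with path_to_log pointing at the log on the master, gives a Counter whose most_common() output shows whether the broken pipes and query timeouts concentrate on the two big slaves.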
BR Thomas