Err32: Broken Pipe on distributed monitoring after Update to 2.4.0p10

CMK version: 2.4.0p10.cre
OS version: Debian 11: bullseye

Error message:

Hi folks,

we recently upgraded our monitoring instances to 2.4.0p10. Since then, we have seen more errors and longer loading times when accessing data from our secondary node (distributed monitoring using Livestatus).

Err32: Broken Pipe

We have increased the system resources of the second node (it was running at a constant load of 2.7@4 cores, now down to 1.1@8 cores), which slightly eased the situation, but the errors and very long load times still persist.

My question is:

How can I troubleshoot this issue further to find out what exactly is the reason / bottleneck for the connection? I have sadly not found a log file that contains useful information.

Steps I have taken so far, and what I have observed:

  • Verified the host configs via cmk -vv -O (both instances); ran without findings
  • Checked the Checkmk forums (found "Distributed monitoring issue, 'Broken pipe' error", but the Livestatus thread setting is not available in Raw)
  • The number of errors increases with more active users

Any help would be greatly appreciated!
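
One way to narrow down where the time goes is to time a Livestatus query against the remote site directly, bypassing the GUI. The following is only a minimal sketch, not an official Checkmk tool. It assumes Livestatus is reachable over TCP on port 6557 (adjust to your site), that TLS is enabled on that port, and that certificate verification may be skipped for a quick diagnostic when no CA file is given. The function names `build_query` and `probe` are made up for this example.

```python
import socket
import ssl
import time


def build_query(table="status", columns=("program_version",)):
    """Return a minimal Livestatus GET request as bytes."""
    lines = [
        f"GET {table}",
        "Columns: " + " ".join(columns),
        "OutputFormat: json",
    ]
    # A blank line terminates a Livestatus request.
    return ("\n".join(lines) + "\n\n").encode("utf-8")


def probe(host, port=6557, use_tls=True, ca_file=None, timeout=10.0):
    """Send one query; return (seconds_elapsed, raw_response_bytes)."""
    start = time.monotonic()
    sock = socket.create_connection((host, port), timeout=timeout)
    try:
        if use_tls:
            ctx = ssl.create_default_context(cafile=ca_file)
            # Assumption: the site certificate does not match the host name.
            ctx.check_hostname = False
            if ca_file is None:
                # Assumption: diagnostics only, so certificate checks are skipped.
                ctx.verify_mode = ssl.CERT_NONE
            sock = ctx.wrap_socket(sock)
        sock.sendall(build_query())
        chunks = []
        while True:
            # Without "KeepAlive: on" the server closes after the response,
            # so reading until EOF collects the whole answer.
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    finally:
        sock.close()
    return time.monotonic() - start, b"".join(chunks)
```

Calling `probe("remote-host")` in a loop (the host name is a placeholder) while users are active and comparing the elapsed times with a quiet period should show whether the remote site itself answers slowly or whether the delay only appears in the GUI on the central site.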

Have been having the same issue. RHEL 8, Raw edition. Upgraded yesterday from 2.4.0p10 to p14. That seemed to help for a little while.

Only 1 of the 3 nodes has this issue. It’s inconsistent in its appearance, seems to come back when lots of changes via the UI are made. And always with only the one host/node.

Have tried tweaking livestatus xinetd config on that host, no noticeable change.
Did add the status host setup, that seemed to fix things for a while.

I have found tons of errors on the central site earlier today, in the var/log/web.log file. See below. Note that I can generate those Err32: Broken Pipe failures in the web UI, and they don’t get logged; no idea what’s going on there.

Given the above and significant amounts of wcpgw, it seemed pertinent to restart the various Apache daemons. That does seem to have helped, though that may be partly wishful thinking.

2025-10-23 15:41:04,520 [40] [cmk.gui.wsgi.app 1074543] Request finalizing failed with an error while handling an error
Traceback (most recent call last):
  File "/omd/sites/mon01/lib/python3.12/site-packages/flask/app.py", line 1473, in wsgi_app
    response = self.full_dispatch_request()
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/omd/sites/mon01/lib/python3.12/site-packages/flask/app.py", line 883, in full_dispatch_request
    return self.finalize_request(rv)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/omd/sites/mon01/lib/python3.12/site-packages/flask/app.py", line 902, in finalize_request
    response = self.make_response(rv)
               ^^^^^^^^^^^^^^^^^^^^^^
  File "/omd/sites/mon01/lib/python3.12/site-packages/flask/app.py", line 1198, in make_response
    rv = self.response_class.force_type(
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/omd/sites/mon01/lib/python3.12/site-packages/werkzeug/wrappers/response.py", line 237, in force_type
    response = Response(*run_wsgi_app(response, environ))
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/omd/sites/mon01/lib/python3.12/site-packages/werkzeug/test.py", line 1264, in run_wsgi_app
    app_rv = app(environ, start_response)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/omd/sites/mon01/lib/python3/cmk/gui/wsgi/applications/utils.py", line 48, in __call__
    return self.wsgi_app(environ, start_response)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/omd/sites/mon01/lib/python3/cmk/gui/wsgi/middleware.py", line 60, in wsgi_app
    return self.app(environ, start_response)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/omd/sites/mon01/lib/python3.12/contextlib.py", line 81, in inner
    return func(*args, **kwds)
           ^^^^^^^^^^^^^^^^^^^
  File "/omd/sites/mon01/lib/python3/cmk/gui/wsgi/applications/checkmk.py", line 172, in wsgi_app
    with cmk.ccc.store.cleanup_locks(), sites.cleanup_connections():
                                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/omd/sites/mon01/lib/python3.12/contextlib.py", line 144, in __exit__
    next(self.gen)
  File "/omd/sites/mon01/lib/python3/cmk/gui/sites.py", line 104, in cleanup_connections
    disconnect()
  File "/omd/sites/mon01/lib/python3/cmk/gui/sites.py", line 119, in disconnect
    g.live.disconnect()
  File "/omd/sites/mon01/lib/python3.12/site-packages/cmk/livestatus_client/__init__.py", line 1162, in disconnect
    connected_site.connection.disconnect()
  File "/omd/sites/mon01/lib/python3.12/site-packages/cmk/livestatus_client/__init__.py", line 670, in disconnect
    self._close_socket()
  File "/omd/sites/mon01/lib/python3.12/site-packages/cmk/livestatus_client/__init__.py", line 676, in _close_socket
    self.socket = self.socket.unwrap()
                  ^^^^^^^^^^^^^^^^^^^^
  File "/omd/sites/mon01/lib/python3.12/ssl.py", line 1295, in unwrap
    s = self._sslobj.shutdown()
        ^^^^^^^^^^^^^^^^^^^^^^^
BrokenPipeError: [Errno 32] Broken pipe

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/omd/sites/mon01/lib/python3.12/site-packages/flask/app.py", line 904, in finalize_request
    response = self.process_response(response)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/omd/sites/mon01/lib/python3.12/site-packages/flask/app.py", line 1281, in process_response
    response = self.ensure_sync(func)(response)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/omd/sites/mon01/lib/python3/cmk/gui/wsgi/blueprints/checkmk.py", line 53, in after_request
    sites.disconnect()
  File "/omd/sites/mon01/lib/python3/cmk/gui/sites.py", line 119, in disconnect
    g.live.disconnect()
  File "/omd/sites/mon01/lib/python3.12/site-packages/cmk/livestatus_client/__init__.py", line 1162, in disconnect
    connected_site.connection.disconnect()
  File "/omd/sites/mon01/lib/python3.12/site-packages/cmk/livestatus_client/__init__.py", line 670, in disconnect
    self._close_socket()
  File "/omd/sites/mon01/lib/python3.12/site-packages/cmk/livestatus_client/__init__.py", line 676, in _close_socket
    self.socket = self.socket.unwrap()
                  ^^^^^^^^^^^^^^^^^^^^
  File "/omd/sites/mon01/lib/python3.12/ssl.py", line 1295, in unwrap
    s = self._sslobj.shutdown()
        ^^^^^^^^^^^^^^^^^^^^^^^
BrokenPipeError: [Errno 32] Broken pipe
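
For what it's worth, both tracebacks end in `socket.unwrap()` during `disconnect()`, that is, while shutting down TLS on a Livestatus connection. A broken pipe at that point means the other end of the connection was already closed when the client tried to write the TLS shutdown message. The sketch below only illustrates that failure mode and a tolerant way to close such a socket. It is not Checkmk's code and not an official fix, and `close_quietly` is a made-up name.

```python
def close_quietly(sock):
    """Close a plain or TLS socket without raising if the peer is already gone.

    Returns the name of the exception swallowed during TLS shutdown, or None.
    """
    swallowed = None
    try:
        unwrap = getattr(sock, "unwrap", None)
        if unwrap is not None:
            # unwrap() performs the TLS shutdown handshake. Writing the
            # close_notify alert fails with EPIPE (BrokenPipeError) when
            # the peer has already closed the connection.
            sock = unwrap()
    except OSError as exc:
        # BrokenPipeError and ssl.SSLError are both subclasses of OSError.
        swallowed = type(exc).__name__
    finally:
        # If unwrap() failed, sock is still the TLS socket; close it anyway.
        sock.close()
    return swallowed
```

If this reading is right, the open question is why the connection was dropped before the GUI was done with it, for example by a timeout or a restart on the remote side or by something in between.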

I'm also getting this error for some of the sites while updating CMK from 2.3.0p37 to 2.4.0p11. Is there any update from CMK Support yet?

The solution that seems to work for us: on the central site, restart omd, and restart httpd.

# omd restart
# systemctl restart httpd

Possibly only one of those is needed. We do not need to do anything on the “remote” node that is allegedly timing out. That seems to fix things for a few days to weeks.
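
To see when the errors come back after a restart, one could count the BrokenPipeError entries in var/log/web.log per hour. This is a rough sketch that assumes the log format from the excerpt posted earlier in this thread (a timestamp line followed by a traceback). The function name `broken_pipes_per_hour` is made up for this example.

```python
import re
from collections import Counter

# Matches the timestamp prefix of a web.log entry, for example
# "2025-10-23 15:41:04,520 [40] [cmk.gui.wsgi.app 1074543] ..."
# and captures the date plus the hour.
TS_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}):\d{2}:\d{2},\d{3} ")


def broken_pipes_per_hour(lines):
    """Count log entries containing a BrokenPipeError, keyed by 'YYYY-MM-DD HH'."""
    counts = Counter()
    current_hour = None
    counted = False  # True once the current entry has been counted
    for line in lines:
        match = TS_RE.match(line)
        if match:
            # A new log entry starts here.
            current_hour = match.group(1)
            counted = False
        elif current_hour and not counted and line.startswith("BrokenPipeError"):
            # Chained tracebacks repeat the error; count each entry once.
            counts[current_hour] += 1
            counted = True
    return counts
```

On the central site this could be fed with the open log file, for example `broken_pipes_per_hour(open("/omd/sites/mon01/var/log/web.log"))` for the site name seen in the traceback. Comparing the hourly counts with the times of configuration changes or restarts might show a pattern.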