CMK version:
Checkmk Raw Edition 2.2.0p14
OS version Server:
Red Hat Enterprise Linux release 9.3 (Plow)
OS version Agent:
Ubuntu 22.04.3 LTS
Error message:
From the cmk server I get the following error after activating TLS on the agent host:
[agent] Communication failed: [SSL: SSLV3_ALERT_HANDSHAKE_FAILURE] sslv3 alert handshake failure (_ssl.c:2580) CRIT, [piggyback] Success (but no data found for this host), Missing monitoring data for all plugins WARN, execution time 0.0 sec
On the agent, the output of sudo -u cmk-agent cmk-agent-ctl -vvv daemon is:
INFO [cmk_agent_ctl] starting
INFO [cmk_agent_ctl] Loaded config from '"/var/lib/cmk-agent/cmk-agent-ctl.toml"', connection registry from '"/var/lib/cmk-agent/registered_connections.json"'
INFO [cmk_agent_ctl::modes::daemon] Could not load pre-configured connections from "/var/lib/cmk-agent/pre_configured_connections.json": No such file or directory (os error 2)
DEBUG [cmk_agent_ctl::misc] Sleeping 28s to avoid DDOSing of sites
INFO [cmk_agent_ctl::modes::pull] Start listening for incoming pull requests
INFO [cmk_agent_ctl::modes::pull] Listening on [::]:6556 for incoming pull connections (IPv6 & IPv4 if activated)
DEBUG [cmk_agent_ctl::misc] Sleeping 12s to avoid DDOSing of sites
INFO [cmk_agent_ctl::modes::pull] [::ffff:10.128.1.148]:49768: Handling pull request.
DEBUG [cmk_agent_ctl::modes::pull] [::ffff:10.128.1.148]:49768: Handling pull request DONE (Task detached).
DEBUG [cmk_agent_ctl::modes::pull] handle_request starts
DEBUG [rustls::server::hs] decided upon suite TLS13_AES_256_GCM_SHA384
WARN [rustls::conn] Sending fatal alert HandshakeFailure
DEBUG [cmk_agent_ctl::modes::renew_certificate] Checking registered connections for certificate expiry.
DEBUG [cmk_agent_ctl::modes::pull] processed task!
WARN [cmk_agent_ctl::modes::pull] [::ffff:10.128.1.148]:49768: Request failed. (invalid peer certificate contents: invalid peer certificate: UnknownIssuer)
On the server, in ~site/etc/ssl/sites/, I can verify that the certificate looks OK; it is working properly for over 90 hosts.
$ openssl x509 -in site.pem -noout -text
Certificate:
Data:
Version: 3 (0x2)
Serial Number:
06:bf:a9:f6:bb:08:eb:c7:21:96:02:db:5a:05:ff:4e:47:13:37:b2
Signature Algorithm: sha256WithRSAEncryption
Issuer: CN = Site 'site' local CA
Validity
Not Before: Oct 11 21:38:05 2022 GMT
Not After : Feb 11 21:38:05 3021 GMT
Subject: CN = site
On the agent, looking in /var/lib/cmk-agent/registered_connections.json, specifically at the root_cert, I can verify that it is the same on the working hosts and on this non-working host.
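For anyone who wants to repeat that root_cert comparison without eyeballing PEM blobs, here is a minimal Python sketch. It assumes only that the file is JSON and that the certificate is stored as a PEM string under a key named root_cert; it does not rely on any other part of the file layout, which I have not verified. The function names are my own and not part of Checkmk.

```python
import hashlib
import json
import ssl


def find_root_certs(node, path=""):
    """Yield (path, pem) for every 'root_cert' string anywhere in the JSON tree."""
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}/{key}"
            if key == "root_cert" and isinstance(value, str):
                yield child, value
            else:
                yield from find_root_certs(value, child)
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from find_root_certs(value, f"{path}[{index}]")


def pem_fingerprint(pem):
    """SHA-256 hex digest over the DER bytes of the first certificate in a PEM string."""
    end_marker = "-----END CERTIFICATE-----"
    first = pem.strip().split(end_marker)[0] + end_marker
    der = ssl.PEM_cert_to_DER_cert(first)
    return hashlib.sha256(der).hexdigest()


def root_cert_fingerprints(registry_text):
    """Map each root_cert location in the registry JSON to its fingerprint."""
    registry = json.loads(registry_text)
    return {path: pem_fingerprint(pem) for path, pem in find_root_certs(registry)}
```

Running root_cert_fingerprints() on the contents of /var/lib/cmk-agent/registered_connections.json from a working host and from the failing host gives short digests that can be compared directly. Identical digests mean identical root certificates.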
Has anyone else run into this issue?
EDIT: The only difference I see between the output of a good agent and that of the bad one is that the bad one includes IPv6. For example, from a good host, which does not show any IPv6 in the output:
INFO [cmk_agent_ctl::modes::pull] Listening on 0.0.0.0:6556 for incoming pull connections (IPv4)
INFO [cmk_agent_ctl::modes::pull] 10.128.1.148:44570: Handling pull request.
DEBUG [cmk_agent_ctl::modes::pull] 10.128.1.148:44570: Handling pull request DONE (Task detached).
DEBUG [cmk_agent_ctl::modes::pull] handle_request starts
DEBUG [rustls::server::hs] decided upon suite TLS13_AES_256_GCM_SHA384
DEBUG [cmk_agent_ctl::modes::pull] processed task!
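Comparing the two outputs by eye is error-prone because every line carries a different peer address and port. The sketch below is one way to do it in Python: it replaces the peer prefix with a fixed token and lists the lines that appear only in the first log. The regular expression is an assumption based on the two address formats seen in this post (10.128.1.148:44570 and [::ffff:10.128.1.148]:49768), and the function names are hypothetical.

```python
import re

# Matches a peer prefix such as "10.128.1.148:44570: " or "[::ffff:10.128.1.148]:49768: ".
PEER = re.compile(r"(?:\[[0-9a-fA-F:.]+\]|\d{1,3}(?:\.\d{1,3}){3}):\d+: ")


def normalize(line):
    """Replace the per-connection peer address and port with a fixed token."""
    return PEER.sub("<peer>: ", line.strip())


def only_in_first(first_log, second_log):
    """Return normalized lines of first_log that never occur in second_log, in order."""
    seen = {normalize(line) for line in second_log.splitlines() if line.strip()}
    result = []
    for line in first_log.splitlines():
        norm = normalize(line)
        if norm and norm not in seen and norm not in result:
            result.append(norm)
    return result
```

Applied to the request-handling lines quoted in this post, only_in_first(bad_log, good_log) leaves just the two WARN lines from the failing host: the HandshakeFailure alert and the "Request failed" line ending in UnknownIssuer.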
EDIT 2: Disabling IPv6 on the host and bringing up the agent daemon on IPv4 only does not resolve the issue.