SSLError "bad handshake" - certificate verify failed

CMK version: 2.1.0p12
OS version: CentOS 7

Recently I updated from 2.0.0p7 → 2.0.0p28 (latest) → 2.1.0p12 (current)

I was aware of the TLS change and of the need to register the nodes since v2.1 …,
but nevertheless the ship has run aground completely. On all clients I now see

Version: 2.1.0p12, OS: linux, TLS is not activated on monitored host

and communication fails completely.

Our Checkmk server uses a Let's Encrypt certificate.

I have read the docs back and forth in the meantime, but I'm still missing something.

The last thing I tried was to include the Let's Encrypt certificate chain when baking the agent (screenshot not preserved).

Then I copy this rpm file to the target node, install it there like

$ yum install check-mk-agent-2.1.0p12-b6b330ee6d131e0f_withTLS_lnzcheckmk01_chain.noarch.rpm

then I register the node over https:

$ cmk-update-agent register -s lnzcheckmk01.research.silicon-austria.com -i SAL -H salllgpuc05 -U cmkadmin -P '****' -v -p https
Updated the certificate store "/var/lib/check_mk_agent/cas/all_certs.pem" with 1 certificate(s)
Going to register agent at deployment server
HTTPSConnectionPool(host='lnzcheckmk01.research.silicon-austria.com', port=443): Max retries exceeded with url: /SAL/check_mk/login.py (Caused by SSLError(SSLError("bad handshake: Error([('SSL routines', 'tls_process_server_certificate', 'certificate verify failed')])")))
See syslog or Logfile at /var/lib/check_mk_agent/cmk-update-agent.log for details.

How can I get the ship running again?
I’m running out of ideas what to check and where.

Thanks a lot!

First step: I would not use the full chain inside the deployed agent, as the server certificate in this chain is only valid for a maximum of 90 days. Only use the root certificate.

The message "TLS is not activated on monitored host" has nothing to do with the agent updater or the registration for updates. It shows the state of the agent communication itself. Do you get data from the agents at the moment? If yes, then the only issue is that the transport of the agent data is not configured to use TLS.


Thanks for your assistance, Andreas.

But my approach is correct, isn't it?
I have to include the Let's Encrypt certificate and bake the agent, right?

If yes … I have now included only the Let's Encrypt root certificate in the agent, baked it, and deployed it. Same error.

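These are my settings:

[screenshot 1 not preserved]

and

[screenshot 2 not preserved]

and here is the status of one node:

[screenshot 3 not preserved]

On others it looks slightly different:

[screenshot 4 not preserved]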

To complete the picture: I'm also using nginx in front of the Checkmk server, and that is where the certificate is kept!

The second screenshot does not show the root cert; it is the intermediate.
The first screenshot looks OK.
Third screenshot: fetching agent data works without TLS, and the agent updater complains about a certificate problem from the web server.
On this machine I would check what I see when I run "openssl s_client -connect monitoringserver:443".


You're right, Andreas. Running

$ openssl s_client -connect lnzcheckmk01.research.silicon-austria.com:443

surprisingly showed me an error:

Verification error: unable to verify the first certificate

This is because I was running nginx with only the certificate, not the certificate chain (the browser showed no error). I then changed this in nginx and ran the above command again → no error anymore :)
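
For anyone hitting the same problem, here is a minimal sketch of how to check how many certificates the web server actually sends. It assumes `openssl` is installed on the monitored host; the helper name `count_certs` is made up for this illustration. A server that sends only its own certificate should give a count of 1, while a server that also sends the intermediate gives 2 or more.

```shell
#!/bin/sh
# Hypothetical helper: count the PEM certificates found on stdin.
count_certs() {
    grep -c -- '-----BEGIN CERTIFICATE-----'
}

# Usage sketch (needs network access; host name taken from this thread):
#   openssl s_client -connect lnzcheckmk01.research.silicon-austria.com:443 \
#       -servername lnzcheckmk01.research.silicon-austria.com -showcerts \
#       </dev/null 2>/dev/null | count_certs
```

Browsers often fetch or cache a missing intermediate on their own, which is one plausible reason why the browser showed no error while the agent updater did.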

Now this worked with https:

$ cmk-update-agent register -s lnzcheckmk01.research.silicon-austria.com -i SAL -H $(hostname -s) -U cmkadmin -P '**' -v -p https
Going to register agent at deployment server
Successfully registered agent of host "salllgpuc05" for deployment.
You can now update your agent by running 'cmk-update-agent -v'
Saved your registration settings to /etc/cmk-update-agent.state.

then I did:

$ cmk-update-agent -v

+-------------------------------------------------------------------+
|                                                                   |
|  Checkmk Agent Updater v2.1.0p12 - Update                         |
|                                                                   |
+-------------------------------------------------------------------+
Getting target agent configuration for host 'salllgpuc05' from deployment server
Target state (from deployment server):
  Agent Available: True
  Signatures: 1
  Target Hash: 47156a14c9b6c0c4
Agent 47156a14c9b6c0c4 already installed.

$ cmk-agent-ctl status
Version: 2.1.0p12
Agent socket: operational
IP allowlist: any
Legacy mode: enabled
No connections

This has been the state for more than an hour now. The agent is set to update itself every 60 min.

And now? In the end, should I bake the agent including the Let's Encrypt certificate and deploy it to get full TLS functionality?

I would test this on one specific node first.

Thanks…

No, the TLS for agent data transport has nothing to do with the agent updater or the web server's HTTPS certificate.
Please have a look at this article.

There are links to all the separate security measures (TLS, HTTPS, and so on).

In your case the agent is currently working fine with the old encryption, if configured.

The updater only checks every hour whether a new agent has been built. An update only happens if a new agent is present and signed.


I got it!

To be honest, I had not noticed that these are two different, very similar-looking commands. Too bad!

$ cmk-update-agent register
$ cmk-agent-ctl register
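
To spell out the difference: `cmk-update-agent register` registers the host for automatic agent updates (HTTPS to the web server), while `cmk-agent-ctl register` sets up the TLS-encrypted transport of the agent data. Below is a sketch of the second call using the values from this thread. The option names are given as they appear in the Checkmk 2.1 documentation to the best of my knowledge, so please verify them with `cmk-agent-ctl register --help`. The helper function is made up for this illustration and only prints the command so it can be reviewed first; the password option is left out on purpose.

```shell
#!/bin/sh
# Sketch only: prints the registration command instead of running it.
# Server, site and user are the values used in this thread.
SERVER="lnzcheckmk01.research.silicon-austria.com"
SITE="SAL"
USER_NAME="cmkadmin"

# Hypothetical helper; $1 = host name as known to the monitoring site.
agent_ctl_register_cmd() {
    printf 'cmk-agent-ctl register --hostname %s --server %s --site %s --user %s\n' \
        "$1" "$SERVER" "$SITE" "$USER_NAME"
}

agent_ctl_register_cmd "salllgpuc05"
```

After a successful registration, `cmk-agent-ctl status` should list a connection instead of "No connections".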

Thank you very much for assisting me along the way!
