Registering the agent updater fails with "certificate has expired"

We are using 2.0.0p12 (CEE) with agent bakery, monitoring mostly Windows 2016/2019 servers.

All of my monitored hosts suddenly started reporting issues with automatic updates, specifically in the WATO console I see:

Error: HTTPSConnectionPool(host='nagios.xxxxx.com', port=443): Max retries exceeded with url: /nagios/check_mk/deploy_agent.py (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1125)')))
WARN, Last update check: 2021-12-13 09:37:22 (warn at 2 d)
WARN, Last agent update: 2021-10-06 08:41:28, Update URL: https://nagios.xxxxx.com/nagios/check_mk, Agent configuration: 70da1a76

The certificate used by the CMK web server (apache2) had in fact expired, and I renewed it. Accessing the WATO interface works without issues; the certificate is valid with a proper sub/root chain, all healthy.

On the client side (Windows) I tried to re-register the agent, with no luck.

./check_mk_agent.exe updater register -s nagios.xxxxx.com -i nagios -H xxxxx -P xxxxx -U cmkadmin -v -p https

./check_mk_agent.exe : Updated the certificate store "C:\ProgramData\checkmk\agent\config\cas\all_certs.pem" with 1
certificate(s)
    + CategoryInfo          : NotSpecified: (Updated the cer... certificate(s):String) [], RemoteException
    + FullyQualifiedErrorId : NativeCommandError

Going to register agent at deployment server
HTTPSConnectionPool(host='nagios.xxxxx.com', port=443): Max retries exceeded with url: /nagios/check_mk/login.py
(Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed:
certificate has expired (_ssl.c:1125)')))

I’ve confirmed that I can curl to the same URL endpoint without certificate issues, the certificate is NOT expired!

curl https://nagios.xxxxx.com -v

* Rebuilt URL to: https://nagios.xxxxx.com/
*   Trying 172.16.4.6...
* TCP_NODELAY set
* Connected to nagios.xxxxx.com (172.16.4.6) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/certs/ca-certificates.crt
  CApath: /etc/ssl/certs
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS Unknown, Certificate Status (22):
* TLSv1.3 (IN), TLS handshake, Unknown (8):
* TLSv1.3 (IN), TLS Unknown, Certificate Status (22):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS Unknown, Certificate Status (22):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS Unknown, Certificate Status (22):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Client hello (1):
* TLSv1.3 (OUT), TLS Unknown, Certificate Status (22):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_256_GCM_SHA384
* ALPN, server accepted to use http/1.1
* Server certificate:
*  subject: CN=nagios.xxxxx.com
*  start date: Dec 16 23:37:19 2021 GMT
*  expire date: Dec 15 23:37:19 2024 GMT
*  subjectAltName: host "nagios.xxxxx.com" matched cert's "nagios.xxxxx.com"
*  issuer: DC=com; DC=xxxxx; DC=vand1; CN=xxxxxIntermediateCAv2
*  SSL certificate verify ok.
* TLSv1.3 (OUT), TLS Unknown, Unknown (23):
> GET / HTTP/1.1
> Host: nagios.xxxxx.com
> User-Agent: curl/7.58.0
> Accept: */*
> 
* TLSv1.3 (IN), TLS Unknown, Certificate Status (22):
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* TLSv1.3 (IN), TLS Unknown, Certificate Status (22):
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* TLSv1.3 (IN), TLS Unknown, Unknown (23):
< HTTP/1.1 200 OK
< Date: Fri, 17 Dec 2021 00:17:16 GMT
< Server: Apache/2.4.29 (Ubuntu)
< Last-Modified: Sun, 23 Feb 2020 16:59:46 GMT
< ETag: "58-59f412e8fbf5c"
< Accept-Ranges: bytes
< Content-Length: 88
< Vary: Accept-Encoding
< Content-Type: text/html
< 
<meta http-equiv="Refresh" content="0; url=https://nagios.xxxxx.com/nagios/check_mk" />

The root and sub CA are trusted on the server host; they are in /usr/local/share/ca-certificates.

openssl s_client -connect nagios.xxxxx.com:443

CONNECTED(00000003)
depth=2 CN = xxxxxRootCAv2
verify error:num=19:self signed certificate in certificate chain
---
Certificate chain
 0 s:/CN=nagios.xxxxx.com
   i:/DC=com/DC=xxxxx/DC=xxxxx/CN=xxxxxIntermediateCAv2
 1 s:/CN=xxxxxRootCAv2
   i:/CN=xxxxxRootCAv2
 2 s:/DC=com/DC=xxxxx/DC=xxxxx/CN=xxxxxIntermediateCAv2
   i:/CN=xxxxxRootCAv2
---
Server certificate
-----BEGIN CERTIFICATE-----
.....
-----END CERTIFICATE-----
subject=/CN=nagios.xxxxx.com
issuer=/DC=com/DC=xxxxx/DC=xxxxx/CN=xxxxxIntermediateCAv2
---
No client certificate CA names sent
Peer signing digest: SHA512
Server Temp Key: ECDH, P-256, 256 bits
---
SSL handshake has read 4986 bytes and written 415 bytes
---
New, TLSv1/SSLv3, Cipher is ECDHE-RSA-AES256-GCM-SHA384
Server public key is 2048 bit
Secure Renegotiation IS supported
Compression: NONE
Expansion: NONE
No ALPN negotiated
SSL-Session:
    Protocol  : TLSv1.2
    Cipher    : ECDHE-RSA-AES256-GCM-SHA384
    Session-ID: 127AA86133A6EFE077411A1AE1EDF0E58F444777465811F2381DFA9654742B77
    Session-ID-ctx: 
    Master-Key: C891EADB6D553B2D745DCB7ABA2DB7186C622E87CE98716D9FE888AFEC1B7C0D432D26E889AD2358F8F4BBF7E86F15DC
    Key-Arg   : None
    Krb5 Principal: None
    PSK identity: None
    PSK identity hint: None
    TLS session ticket lifetime hint: 300 (seconds)
    TLS session ticket: ....
    Start Time: 1639700780
    Timeout   : 300 (sec)
    Verify return code: 19 (self signed certificate in certificate chain)
---
closed

In WATO > Global Settings > Trusted certificate authorities for SSL I have added the SubCA and also made sure “Trust system wide configured CAs” is enabled.

I removed the CMK agent from one host and did a clean MSI install. The error changes slightly when trying to register the updater:

Updated the certificate store "C:\ProgramData\checkmk\agent\config\cas\all_certs.pem" with 1 certificate(s)
Going to register agent at deployment server
HTTPSConnectionPool(host='nagios.xxxxx.com', port=443): Max retries exceeded with url: /nagios/check_mk/login.py (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: **self signed certificate in certificate chain** (_ssl.c:1125)')))

It seems I am dealing with two errors. Existing agents fail with:
certificate verify failed: certificate has expired

After re-installing the agent with the clean option, registration fails with:
self signed certificate in certificate chain

I can’t seem to get past this and have been stuck for hours. The certificates used are issued by an internal PKI.

Anyone able to help me out?

I think I figured it out: I likely skipped some steps when renewing the certificate chain.

Updating the "Agent updater (Linux, Windows, Solaris)" WATO rule with our new RootCA, baking the agents, and deploying the new agent .msi on the monitored server as a clean install works without any issues; I can register the updater without SSL errors.

The issue is that all agents deployed on our servers were baked with the old, now expired certificate, and thus we are seeing the "certificate verify failed: certificate has expired" warning in WATO.
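This also explains why curl succeeded while the updater failed: per the output above, curl verified against the system store (/etc/ssl/certs/ca-certificates.crt), while the updater reports using its own all_certs.pem. A minimal Python sketch for comparing the two, assuming Python is available on a test machine; the function name check_tls is made up for illustration and is not part of Checkmk:

```python
import socket
import ssl
from typing import Optional


def check_tls(host: str, port: int = 443, cafile: Optional[str] = None,
              timeout: float = 5.0) -> str:
    """Attempt a verified TLS handshake and describe the result.

    If cafile is given, only the certificates in that PEM file are trusted;
    otherwise the system default store is used.
    """
    context = ssl.create_default_context(cafile=cafile)
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                return "verify ok (" + str(tls.version()) + ")"
    except ssl.SSLCertVerificationError as exc:
        # verify_message carries texts such as "certificate has expired"
        return "verify failed: " + exc.verify_message
    except OSError as exc:
        # DNS errors, refused connections, timeouts, other TLS errors
        return "connection failed: " + str(exc)
```

Calling check_tls("nagios.xxxxx.com") and then check_tls("nagios.xxxxx.com", cafile=r"C:\ProgramData\checkmk\agent\config\cas\all_certs.pem") should show whether only the updater's store is the problem. This only approximates what the updater does internally.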

Is there a way to update the baked certificate on all existing hosts WITHOUT uninstalling and re-installing the newly baked agent?

Here is a fix that worked for me, in case anyone else runs into this.

Distribute the files below from the newly baked agent to all agents that were baked with the expired certificate.

C:\ProgramData\checkmk\agent\config\cas\all_certs.pem
C:\ProgramData\checkmk\agent\config\cmk-update-agent.cfg

Then re-running check_mk_agent.exe updater -v resolved the warnings for us.
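If you have to do this on many hosts, the copy step can be scripted. A rough Python sketch, assuming the two files from a freshly baked agent have already been placed in a staging directory on the host; the names install_updater_files, staging_dir and UPDATER_FILES are hypothetical, and how the staging directory gets to each host (GPO, software distribution, etc.) is left out:

```python
import shutil
from pathlib import Path
from typing import List

# The two files named above, relative to the agent config directory.
UPDATER_FILES = (Path("cas") / "all_certs.pem", Path("cmk-update-agent.cfg"))


def install_updater_files(staging_dir: Path, config_dir: Path) -> List[Path]:
    """Copy the updater files from staging_dir into config_dir.

    An existing target is first saved next to itself with a ".bak" suffix.
    Returns the list of target paths that were written.
    """
    # Check all sources up front so a missing file does not leave a half-done copy.
    for rel in UPDATER_FILES:
        if not (staging_dir / rel).is_file():
            raise FileNotFoundError(f"missing in staging directory: {staging_dir / rel}")

    written = []
    for rel in UPDATER_FILES:
        target = config_dir / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        if target.exists():
            shutil.copy2(target, target.with_name(target.name + ".bak"))
        shutil.copy2(staging_dir / rel, target)
        written.append(target)
    return written
```

On a Windows host, config_dir would be Path(r"C:\ProgramData\checkmk\agent\config"), matching the paths above. Keeping a .bak copy is just a precaution so the old files can be restored.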

If the CA certificate is about to expire, you need to distribute the new CA certificate alongside the old one first, before changing the certificate of the checkmk webserver.

This is a common certificate management task.

It also applies to the checkmk agent updater as it uses its own certificate store.
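To illustrate what "new together with old" means for a PEM store: the bundle simply contains both CA certificates during the transition. A small Python sketch that merges PEM bundles and drops duplicates; merge_pem_bundles is a made-up helper name, and it only handles the text format, not how the bundle gets into the agent updater rule and onto the hosts:

```python
import re

PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----.+?-----END CERTIFICATE-----", re.DOTALL
)


def merge_pem_bundles(*bundles: str) -> str:
    """Concatenate PEM bundles, keeping the first copy of each certificate."""
    seen = set()
    merged = []
    for text in bundles:
        for block in PEM_BLOCK.findall(text):
            # Compare with all whitespace removed so that line-ending or
            # line-wrapping differences do not produce duplicates.
            key = "".join(block.split())
            if key not in seen:
                seen.add(key)
                # splitlines() also normalises CRLF line endings to LF.
                merged.append("\n".join(block.splitlines()))
    return "\n".join(merged) + "\n" if merged else ""
```

Once every agent trusts both CAs, the web server certificate can be switched to one issued under the new CA, and the old CA can be dropped from the bundle later.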

That was my issue: I didn’t distribute the new CA before the old one expired. I got stuck because the updater would not pull the latest baked agents once the old cert had expired, so I had to update the cert on the agent endpoints manually. It has been certificate chaos for the last few days here… :frowning:

There is an extension available that is able to check certificates in files on Linux. With this check you could get a warning 180 days before your CA certificate expires.

https://exchange.checkmk.com/p/sslcertificates
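For a quick manual look at a CA file without the extension, Python's standard ssl module can list the expiry dates. This is only a sketch and not how the extension works; days_left and expiring_cas are hypothetical names, and get_ca_certs() only reports certificates that OpenSSL recognises as CA certificates:

```python
import ssl
import time
from typing import List, Optional, Tuple


def days_left(not_after: str, now: Optional[float] = None) -> float:
    """Days until a notAfter time such as 'Dec 15 23:37:19 2024 GMT'."""
    if now is None:
        now = time.time()
    return (ssl.cert_time_to_seconds(not_after) - now) / 86400.0


def expiring_cas(cafile: str, warn_days: float = 180.0) -> List[Tuple[str, float]]:
    """Return (subject, days left) for CA certs in cafile with at most warn_days left."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)  # starts with an empty store
    context.load_verify_locations(cafile=cafile)
    findings = []
    for cert in context.get_ca_certs():
        remaining = days_left(cert["notAfter"])
        if remaining <= warn_days:
            findings.append((str(cert["subject"]), remaining))
    return findings
```

Running expiring_cas() against the agent's all_certs.pem or a file under /usr/local/share/ca-certificates returns an empty list if nothing expires within the threshold; a negative "days left" value means the CA has already expired.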
