[Release] Checkmk stable release 2.3.0p28

Dear friends of Checkmk,

the new stable release 2.3.0p28 of Checkmk is ready for download.

This stable release ships with 14 changes affecting all editions of Checkmk,
7 changes for the Enterprise editions, 0 Cloud Edition specific and
2 Managed Services Edition specific changes.

Changes in all Checkmk Editions:

BI

  • 17596 FIX: Remove double confirmation when deleting a BI rule…

Checks & agents

  • 17370 Ship python package “oracledb” with omd…
    NOTE: Please refer to the migration notes!
  • 17567 FIX: Fix predictions calculation for predictive levels…
  • 17625 FIX: Windows: Remote Desktop Licenses: Correctly identify ‘Windows Server 2025’
  • 17620 FIX: agent_aws: Do not crash when invalid credentials are given…
  • 17563 FIX: agent_kube: Honor HTTP proxy setting for Cluster Collector again…
  • 17523 FIX: raritan_pdu_plugs: Don’t discover plugs in unknown state…
  • 17047 FIX: systemd_units: Properly exclude ignored services from failed count in systemd summary…

Setup

  • 17496 FIX: Show incomplete list of backups instead of crashing when querying a large number of AWS S3 backups…

User interface

  • 17676 FIX: Display elements of bidirectional graphs in correct order in graph legend…
  • 17474 FIX: Don’t close dropdown of folder choices while moving folder…
  • 17598 FIX: Filter value of “Host has software package” not loaded on view edit…
  • 17677 FIX: Label MinimumOf and MaximumOf graph elements accordingly in legends and mouse hovers…
  • 17006 FIX: View filtering: Preserve necessary filters on “Reset”…

Changes in the Checkmk Enterprise Edition:

Alert handlers

  • 17691 FIX: Match on host and service labels…
    NOTE: Please refer to the migration notes!

Checks & agents

  • 17681 FIX: Fix enabling realtime checks via bakery on Windows…
  • 17675 FIX: Synthetic Monitoring: Improve compatibility of Robotmk scheduler on Windows…
  • 17562 FIX: agent_kube: requests.SSLError raised on connection using self signed certificates…

REST API

  • 17508 FIX: REST-API: change DCD request and response types…
    NOTE: Please refer to the migration notes!

Reporting & availability

  • 17473 FIX: No page break before first subreport…
    NOTE: Please refer to the migration notes!

Site management

  • 17561 FIX: omd update: Unconditionally save omd config…

Changes in the Checkmk Cloud Edition:

NO CHANGES

Changes in the Checkmk Cloud (SaaS):

NO CHANGES

Changes in the Checkmk Managed Services Edition:

BI

  • 14242 FIX: BI configurations viewed on remote sites are no longer broken…

User interface

  • 17690 FIX: Use correct site filter on remote sites for “Number of” painter…

You can download Checkmk from our download page: Download Checkmk for free | Checkmk

List of all changes: Werks

Thank you very much for using Checkmk. We wish you successful monitoring,

Your Checkmk Team

I just deployed p28, having skipped p27 because of the previously reported systemd issues.
However, the systemd issue described in Werk #17047 (systemd_units: Properly exclude ignored services from failed count in systemd summary) still surfaced over here.

Checkmk now complains about a failed service on all monitored hosts.
These hosts are a mix of Rocky Linux 9.5, Debian, Proxmox, Ubuntu and SUSE Linux Enterprise Server.

When examining one host, I did find a failed service:

systemctl list-units --failed
  UNIT                        LOAD   ACTIVE SUB    DESCRIPTION
● oes-telemetry-agent.service loaded failed failed OES Telemetry agent for OES

LOAD   = Reflects whether the unit definition was properly loaded.
ACTIVE = The high-level unit activation state, i.e. generalization of SUB.
SUB    = The low-level unit activation state, values depend on unit type.

1 loaded units listed. Pass --all to see loaded but inactive units, too.
To show all installed unit files use 'systemctl list-unit-files'.

As a test I disabled it, to see whether the reported CRIT would clear:

systemctl disable oes-telemetry-agent.service --now

I rescanned the host, but the CRIT was still present.
After that I reset the failed state of the service:

systemctl reset-failed

This cleared the status when re-running systemctl list-units --failed.
But even after rescanning the host again, the CRIT remained.
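
For anyone wanting to repeat this check on several hosts, here is a minimal sketch of how the systemctl output could be reduced to just the names of failed units. The function name failed_units is made up for illustration, and the sketch assumes the output format of systemctl list-units --failed --plain --no-legend (unit name in the first column, ACTIVE state in the third). It only shows what systemd itself reports and says nothing about how the Checkmk plugin evaluates it.

```shell
#!/bin/sh
# Hypothetical helper, not part of Checkmk or systemd.
# Reads `systemctl list-units --failed --plain --no-legend` style output
# on stdin and prints the name of every unit whose ACTIVE column is "failed".
failed_units() {
    # Columns: UNIT LOAD ACTIVE SUB DESCRIPTION...
    awk 'NF >= 4 && $3 == "failed" { print $1 }'
}

# Usage on a monitored host (commented out so the sketch has no side effects):
#   systemctl list-units --failed --plain --no-legend | failed_units
```

If this prints nothing after systemctl reset-failed while Checkmk still reports a failed service, the stale state is more likely on the monitoring side than on the host.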

So unfortunately I reverted to my pre-upgrade snapshot, effectively undoing the upgrade, and will list it as 'Unsuccessful'. :frowning:

  • Glowsome

Hi Michael,

Thank you for the report!
Looking into it.

Hi Sara,

Any news regarding the issue I reported?

  • Glowsome

+1 to Glowsome’s issue. I upgraded today from version 2.3.0p15.cre to 2.3.0p28.cre and all my servers showed up with critical status (failed services) in the monitoring dashboard. It appears Checkmk stopped honoring the “Setup > Services > discovery rules > Disabled services” rules I had in place to not alert on lm_sensors, ntpdate, etc.
I also have the Azure metric-sourcers service on several systems. While I don’t have a rule in place for it, Checkmk did not alert on the service before the upgrade. Thanks!
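
To make the expected behavior concrete, here is a rough sketch of what "exclude ignored services from the failed count" means. It is not Checkmk's implementation: the function count_failed_not_ignored, the IGNORE_REGEX variable and the example regex are all assumptions for illustration, and the input is assumed to be in the systemctl list-units --plain --no-legend format.

```shell
#!/bin/sh
# Hypothetical helper, not Checkmk code.
# Counts units whose ACTIVE column is "failed", skipping every unit whose
# name matches the extended regex given as the first argument.
# An empty regex means "ignore nothing".
count_failed_not_ignored() {
    # Pass the regex through the environment so awk does not
    # reinterpret backslashes the way it does for -v assignments.
    IGNORE_REGEX="$1" awk '
        BEGIN { ignore = ENVIRON["IGNORE_REGEX"] }
        $3 == "failed" && (ignore == "" || $1 !~ ignore) { n++ }
        END { print n + 0 }
    '
}

# Usage on a monitored host (commented out; the regex is only an example):
#   systemctl list-units --plain --no-legend |
#       count_failed_not_ignored '^(lm_sensors|ntpdate)[.]service$'
```

With rules like mine in place, the count should only include failed units that are not covered by an ignore rule, which is what I saw before the upgrade.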

Just confirmed that the fix is being tested and we expect it to be released in p29 :+1:

If possible, I would like to be included in pre-release tests to verify that the issue is no longer present.
As this issue has now persisted for two releases, it is becoming hard to justify updating over here.

I don’t want to sound like a party pooper or ranter, but I do think that quality control of releases needs more than ‘internal testing’ to ensure they land properly in the field.

I just need some additional validation to justify upgrades, as there is still a sentiment of “if it works, don’t fix it” over here.

  • Glowsome

Hello @Glowsome,

First of all, thanks for finding the issue. I agree on the testing part. Here is an MKP that simulates the fix which will land in p29.
Also, sorry for any inconvenience this has caused and thanks for the help!

Let me know if there are any new issues, or if the current one persists.

Best,
Luka
systemd_units_17763-0.0.1.mkp (8.0 KB)

Hello @lracic,

I have re-run the upgrade to 2.3.0p28 and saw the same effect as reported above.

As soon as the patch MKP was applied, all service problems that appeared after the upgrade went away again. :+1:

So visually the reported issue is gone; I have not looked at other aspects of the monitoring server.

  • Glowsome

Thanks for the feedback @Glowsome