CMK version: 2.4.0
OS version: Debian 12 Bookworm
I upgraded our CMK installation to 2.4.0 this morning and noticed that the Prometheus Alertmanager agent stopped returning results for our Alertmanager rules. On further investigation I found a bug in the latest release that seems to cause the issue.
Once the Alertmanager API integration is configured, you can run the check via shell and observe the following output:
cmk --debug -vv --check <hostname>
produces
Expecting value: line 1 column 1 (char 0)
Because the agent cannot produce its section, all alertmanager checks are marked as stale, and the Check_MK service is marked as critical since the agent execution failed with an error.
Analysis
The agent_alertmanager special agent uses the cmk.plugins.lib.prometheus library for utilities such as APISession and for retrieving the API URL that is queried.
The APISession class provides a method for executing API requests. These are sent to a base URI that is specified when the APISession object is created.
The Prometheus Agent uses this APISession to perform HTTP queries against Prometheus. It passes absolute paths, whereas APISession needs relative paths to work as intended. The failing request, which causes the error above, is sent to /rules instead of /api/v1/rules. That URL returns the graphical rules page from Prometheus, which is not valid JSON.
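To illustrate why an HTML page leads to exactly the error message shown above, here is a minimal sketch using only Python's standard json module. The HTML string is a made-up placeholder, not the real Prometheus response, and I am assuming the agent ends up feeding the response body to a JSON parser.

```python
import json

# Placeholder body standing in for the HTML rules page (not the real response).
html_body = "<!DOCTYPE html><html><body>Rules</body></html>"

try:
    json.loads(html_body)
    message = None
except json.JSONDecodeError as exc:
    # The parser fails on the very first character, because "<" cannot
    # start any JSON value.
    message = str(exc)

print(message)
```

The message matches the output of the cmk call, which fits the theory that the agent receives HTML instead of the JSON document from /api/v1/rules.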
The API request is sent from here: checkmk/cmk/plugins/alertmanager/special_agents/agent_alertmanager.py at 3ff176d42b69b8cdd1d1545fa674c0279158d43d · Checkmk/checkmk · GitHub
Under the hood, the following two strings are passed into urljoin:
urljoin('https://<host>/api/v1', '/rules')
This in turn produces:
https://<host>/rules
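The urljoin behaviour can be reproduced outside of Checkmk with the standard library alone. The sketch below uses prometheus.example as a stand-in host name. The last line shows one possible way to get the intended URL (trailing slash on the base, no leading slash on the path); it is my assumption about a fix, not what the Checkmk code does.

```python
from urllib.parse import urljoin

# Stand-in base URL; the real one comes from the agent configuration.
base = "https://prometheus.example/api/v1"

# A leading slash makes the path absolute, so the base path is discarded.
absolute = urljoin(base, "/rules")

# Without a trailing slash on the base, the last segment ("v1") is replaced.
relative_no_slash = urljoin(base, "rules")

# Trailing slash on the base plus a relative path keeps the full prefix.
relative_with_slash = urljoin(base + "/", "rules")

print(absolute)             # https://prometheus.example/rules
print(relative_no_slash)    # https://prometheus.example/api/rules
print(relative_with_slash)  # https://prometheus.example/api/v1/rules
```

Note that simply dropping the leading slash is not enough if the base URI has no trailing slash, since urljoin then replaces the v1 segment.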
Impact
Currently the alertmanager special agent is practically unusable. A workaround is to map /rules to /api/v1/rules on the Prometheus side, which in turn blocks users from viewing the actual /rules Page of