It seems I have the same issue as described here: Strange behaviour of Check_MK Service on Cluster Node
I have a Proxmox cluster “StarCluster” with two nodes, “pve1” and “pve2”, one of which is routinely offline (cold standby). I am using the agent on both nodes as well as the Proxmox API.
However, one or two services, “Check_MK” (and sometimes “Check_MK discovery”), go critical on StarCluster:
It seems to be trying to connect to the Proxmox API on pve1, but that is expected to fail because pve1 is down!
Yet, the cluster should show up as OK because pve2 is up and all data (agent and API) are coming from there.
I already tried setting up an aggregation rule that includes “Check_MK” with the mode “Best node wins”.
But no change. Check_MK on StarCluster remains red.
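To make clear what I expected from “Best node wins”, here is a minimal sketch of the behaviour I had in mind. It is only an illustration of my understanding, not Checkmk's actual implementation: the function name `best_node_wins`, the `SEVERITY` ranking, and the example states are my own assumptions.

```python
# Hypothetical illustration of "Best node wins"; not Checkmk's real code.
# Monitoring state codes as commonly used: 0=OK, 1=WARN, 2=CRIT, 3=UNKNOWN.
OK, WARN, CRIT, UNKNOWN = 0, 1, 2, 3

# Assumed ranking from best to worst (UNKNOWN ranked between WARN and CRIT).
SEVERITY = {OK: 0, WARN: 1, UNKNOWN: 2, CRIT: 3}


def best_node_wins(node_states):
    """Return the least severe state among all nodes of the cluster."""
    return min(node_states.values(), key=SEVERITY.__getitem__)


# pve1 is the cold standby (unreachable), pve2 is up and delivers all data.
states = {"pve1": CRIT, "pve2": OK}
cluster_state = best_node_wins(states)
print(cluster_state)
```

With pve1 down and pve2 healthy, I would expect the cluster-level result to be OK, which is not what I am seeing.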
Is there a way to either make it OK or just remove it altogether from the cluster?
(Note: Check_MK does not show up in the service discovery for StarCluster)

