CRM monitoring gives a different result than expected: it does not go CRITICAL. According to the documentation, “The check will report a CRITICAL state when the reported state is not Started. In addition the check can report a problem if a resource is not handled by a specified node.”
Tested with the following:
- CheckMK Enterprise 1.6.0p13
- RHEL 7.8
- Pacemaker 1.1.21-4 & corosync 2.4.5-4
- No rules defined for “Heartbeat CRM resource status” and “Heartbeat CRM general status”
Testing of CRM code
I created a cluster of 3 VMs and gave the cluster 1 resource.
- Killing the pacemaker process doesn’t work. It only results in CheckMK receiving no information

- Testing it with parameters doesn’t change the status, but gives more information about the cluster (after the fix)

- Disable the resource
pcs resource disable ClusterIP
This results in a resource that is not started
The expectation was that the service check would go to warning or critical (based on the documentation)
Agent output
[root@cleint_vm02 ~]# pcs resource disable ClusterIP
[root@cleint_vm02 ~]# pcs resource
ClusterIP (ocf:IPaddr2): Stopped (disabled)
[root@cleint_vm02 ~]# TZ=UTC crm_mon -1 -r | grep -v ^$ | sed 's/^ //; /^\sResource Group:/,$ s/^\s//; s/^\s/_/g'
Stack: corosync
Current DC: cleint_vm03 (version 1.1.21-4.el7-f14e36fd43) - partition with quorum
Last updated: Tue Oct 13 12:58:07 2020
Last change: Tue Oct 13 14:57:36 2020 by root via cibadmin on cleint_vm02
3 nodes configured
1 resource configured (1 DISABLED)
Online: [ cleint_vm01 cleint_vm02 devkladbgl03 ]
Full list of resources:
ClusterIP (ocf:IPaddr2): Stopped (disabled)
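To make the expectation concrete, here is a minimal sketch of the behaviour the documentation describes, applied to the resource lines from the agent output. The function name `expected_resource_state` and the state constants are hypothetical and only for illustration; this is not the actual Checkmk check code.

```python
# Hypothetical sketch of the documented behaviour, not the Checkmk check itself.
OK, CRIT = 0, 2


def expected_resource_state(line):
    """Map one line from 'Full list of resources' to (state, summary)."""
    # Example line: "ClusterIP (ocf:IPaddr2): Stopped (disabled)"
    name = line.split()[0]
    status = line.rsplit("): ", 1)[1]  # text after the resource type
    reported = status.split()[0]       # "Started", "Stopped", ...
    state = OK if reported == "Started" else CRIT
    return state, "%s: %s" % (name, status)


print(expected_resource_state("ClusterIP (ocf:IPaddr2): Stopped (disabled)"))
print(expected_resource_state("ClusterIP (ocf::heartbeat:IPaddr2): Started client_vm01"))
```

With the disabled resource above, this sketch would return state 2 (CRITICAL), which is what I expected the service to show.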
Side note: improvements to the code
I can also submit all the changes as a merge request to the Checkmk GitHub repository (Checkmk/checkmk) if wanted
Problem
The error for Heartbeat CRM general status is
Invalid parameter {'max_age': 60, 'num_resources': None, 'num_nodes': None}: %d format: a number is required, not NoneType
CRM resource status doesn’t give a weird message
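The general status error looks like a `%d` placeholder being handed `None`. The following sketch reproduces that kind of failure with the parameters from the message. The functions `format_expected` and `format_expected_safe` are hypothetical and only illustrate the pattern; I am assuming, not quoting, how the check formats its output.

```python
# Parameters exactly as shown in the error message.
params = {"max_age": 60, "num_resources": None, "num_nodes": None}


def format_expected(params):
    # Hypothetical formatting step: "%d" cannot render None.
    return "Expected resources: %d" % params["num_resources"]


try:
    format_expected(params)
except TypeError as exc:
    # The exact wording of the message depends on the Python version.
    print("TypeError:", exc)


def format_expected_safe(params):
    # Only format the number when a value is actually present.
    value = params["num_resources"]
    if value is None:
        return "Expected resources: not set"
    return "Expected resources: %d" % value


print(format_expected_safe(params))
```

Guarding against `None` like this would avoid the crash, but it would not fix the missing count itself.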
Solution
Add an extra line to the code:
elif ' '.join(line[1:3]).rstrip('.,').lower() == 'resource configured':
This is needed because my cluster had 1 resource and the code can’t handle the difference between “resource” and “resources”. Here is the agent output on my test setup:
[root@client_vm check_mk_agent]# TZ=UTC crm_mon -1 -r | grep -v ^$ | sed 's/^ //; /^\sResource Group:/,$ s/^\s//; s/^\s/_/g'
Stack: corosync
Current DC: client_vm03 (version 1.1.20-5.el7_7.2-3c4c782f70) - partition with quorum
Last updated: Tue Oct 13 09:25:18 2020
Last change: Thu May 21 00:02:18 2020 by root via cibadmin on client_vm01
3 nodes configured
1 resource configured
Online: [ client_vm01 client_vm02 client_vm03 ]
Full list of resources:
ClusterIP (ocf::heartbeat:IPaddr2): Started client_vm01
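As a rough illustration of the fix, this sketch parses the summary lines and accepts both the singular and the plural wording. The function `parse_counts` and its return layout are hypothetical; it reuses the `' '.join(line[1:3]).rstrip('.,').lower()` comparison from the line above but is not the actual check code.

```python
def parse_counts(lines):
    """Extract node and resource counts from crm_mon summary lines."""
    counts = {"num_nodes": None, "num_resources": None}
    for raw in lines:
        line = raw.split()
        # Summary lines look like "<number> <noun> configured ..."
        if len(line) < 3 or not line[0].isdigit():
            continue
        label = " ".join(line[1:3]).rstrip(".,").lower()
        if label in ("node configured", "nodes configured"):
            counts["num_nodes"] = int(line[0])
        elif label in ("resource configured", "resources configured"):
            counts["num_resources"] = int(line[0])
    return counts


sample = [
    "Stack: corosync",
    "3 nodes configured",
    "1 resource configured (1 DISABLED)",
]
print(parse_counts(sample))
```

Matching both forms in one place means a cluster with exactly one node or one resource is counted the same way as a larger one.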




