Monitoring Nodes, Deployments and Pods Use Case

CMK version: 2.0.0p16 (CRE)
OS version: Debian GNU/Linux 10 (buster)

Error message: N/A

Output of “cmk --debug -vvn hostname”: N/A

Hi,

I have a use case where I need to monitor K8s nodes and pods and trigger notifications when some values are not within a threshold.

I already have a Prometheus server that provides the metrics I need and a Prometheus Alert Manager that triggers some alerts, but I would like to use Checkmk to produce the same alerts and then remove the Prometheus Alert Manager, so that I have only one centralized alerting system.

Some examples of notifications that need to be created by checkmk are the following:

  • Nodes Disk Usage above 80%.

  • Pod memory above 90% for the last 5 minutes.

  • Pod not ready for the last 15 minutes.

  • Deployment available replicas don’t match the replicas in the Deployment specification for more than 15 minutes.

When reading the Checkmk documentation I found two possible options:

  1. Use Prometheus Special Agent and PromQL

This seemed at first a good option, but then I noticed that it only supports one value per query.
This doesn’t seem to be maintainable since checkmk needs to identify and monitor when new nodes, deployments or pods are created dynamically.

  2. Use Kubernetes Special Agent (KSA) and Dynamic Host Configuration (DHC)

This would monitor what I need, but it seems that only the paid Enterprise Edition allows the use of DHC.

So my questions are:

1 - Can all the alerts above be generated in checkmk?

2 - Is it possible with the free version?

3 - Should this be done with KSA and DHC or in another way?

Any help would be much appreciated.

I also couldn’t find any documentation on whether PV and PVC usage can be monitored with the Enterprise Edition.

I would appreciate it if someone could answer this as well.

Hello,
the next release, 2.1, has been announced with new Kubernetes support, but I’m sure it will be an Enterprise feature.
You can use the free 25-host version for first steps.
Ralf

Hi,

the new K8s monitoring is also available in Raw, but you will have to write a script yourself to replace the DHC, e.g. one that runs through the folder of piggyback files and then makes the REST API calls to add the hosts.
Up to you, which path you take.
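
A rough sketch of what such a script could look like, not an official tool. It assumes that each sub-directory of the site's piggyback folder is named after one piggybacked host, that hosts are created with a POST to the `host_config` collection of the REST API, and that authentication uses a `Bearer <user> <secret>` header; please check all three against the documentation of your version. The function names, the API URL, the user and the target folder are made up for illustration.

```python
import json
import urllib.request
from pathlib import Path


def piggyback_hosts(piggyback_dir):
    """Return the names of all sub-directories (assumed: one per piggybacked host)."""
    root = Path(piggyback_dir)
    if not root.is_dir():
        return set()
    return {entry.name for entry in root.iterdir() if entry.is_dir()}


def missing_hosts(piggybacked, configured):
    """Hosts that deliver piggyback data but are not configured yet."""
    return sorted(set(piggybacked) - set(configured))


def build_create_request(api_url, user, secret, host_name, folder="/"):
    """Build (but do not send) the request that creates one host.

    Endpoint path and auth header are assumptions, see the text above.
    """
    body = json.dumps({"host_name": host_name, "folder": folder}).encode("utf-8")
    return urllib.request.Request(
        f"{api_url}/domain-types/host_config/collections/all",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {user} {secret}",
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
    )


def sync_hosts(piggyback_dir, configured, api_url, user, secret,
               folder="/", send=urllib.request.urlopen):
    """Create every piggybacked host that is not configured; return their names.

    `send` is injectable so the logic can be dry-run without a server.
    """
    to_create = missing_hosts(piggyback_hosts(piggyback_dir), configured)
    for host_name in to_create:
        send(build_create_request(api_url, user, secret, host_name, folder))
    return to_create


# Hypothetical usage, e.g. from cron as the site user:
#   sync_hosts("/omd/sites/mysite/tmp/check_mk/piggyback",
#              configured=["node-1"],
#              api_url="http://localhost/mysite/check_mk/api/1.0",
#              user="automation", secret="...")
```

The list of already configured hosts is passed in here to keep the sketch short; it could come from the REST API as well. Newly created hosts only show up in the monitoring after the changes have been activated, and a complete DHC replacement would also have to remove hosts whose piggyback data has disappeared.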

@danielserrao
The alerts you mentioned will be possible with our new monitoring. The beta should start next month, so you can try it out then.
PV/PVCs support will not be immediately available, but hopefully included in another release later this year. In the meantime, we will probably provide a backport for users to add PV/PVC monitoring.

Hello,

great news :wink:
Ralf

Thanks for the answer Martin.

Hi @martin.hirschvogel, is there any news regarding PV/PVC support? We use Checkmk to monitor production Kubernetes clusters, so the volumes are also very important to us.

It is planned for development. Our current prioritization for services is:

  1. Services
  2. CronJobs
  3. Persistent Volumes/PVC

We can shuffle this around if you help us with a good concept for PV/PVC (i.e. talk with us a couple of times and show us how you use PV/PVCs in production).

Hi Martin,
thanks for the information. I would like to help work out a good concept for PV/PVC with you.
I will contact you.