CMK version: 2.0.0p16 (CRE)
OS version: Debian GNU/Linux 10 (buster)
Error message: N/A
Output of “cmk --debug -vvn hostname”: N/A
Hi,
I have a use case where I need to monitor K8s nodes and pods and trigger notifications when certain values cross a threshold.
I already have a Prometheus server that provides the metrics I need and a Prometheus Alertmanager that triggers some alerts, but I would like Checkmk to produce the same alerts and then remove the Alertmanager, so that I have a single centralized alerting system.
Some examples of notifications that Checkmk would need to create:
Nodes Disk Usage above 80%.
Pod memory above 90% for the last 5 minutes.
Pod not ready for the last 15 minutes.
Deployment available replicas don’t match the replicas in the Deployment specification for more than 15 minutes.
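To make the "for the last N minutes" part concrete, here is a minimal Python sketch of how such a sustained condition could be evaluated. The function name `breached_for` and the sample layout are my own illustration, not how Checkmk or Prometheus implement it. It assumes samples arrive as (timestamp, value) pairs and treats an empty window as "not breached".

```python
from typing import Iterable, Tuple

# One sample is (unix timestamp in seconds, measured value), e.g. memory usage in percent.
Sample = Tuple[float, float]


def breached_for(samples: Iterable[Sample], threshold: float,
                 window_s: float, now: float) -> bool:
    """Return True if every sample inside the window is above the threshold.

    Hypothetical helper: an empty window counts as 'not breached', so missing
    data never raises an alert on its own.
    """
    recent = [value for ts, value in samples if now - window_s <= ts <= now]
    return bool(recent) and all(value > threshold for value in recent)


# Pod memory sampled once per minute, checked against 90% over the last 5 minutes.
memory = [(0, 95.0), (60, 92.0), (120, 91.0), (180, 93.0), (240, 96.0), (300, 94.0)]
alert = breached_for(memory, threshold=90.0, window_s=300, now=300)
```

A single dip below the threshold inside the window resets the condition, which is the behaviour I would expect from the alerts listed above.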
When reading the Checkmk documentation I found two possible options:
Use Prometheus Special Agent and PromQL
At first this seemed like a good option, but then I noticed that it only supports one value per query.
This doesn’t seem maintainable, since Checkmk needs to detect and monitor new nodes, deployments or pods as they are created dynamically.
Use Kubernetes Special Agent (KSA) and Dynamic Host Configuration (DHC)
This would monitor what I need, but it seems that only the paid Enterprise Edition allows using DHC.
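To illustrate the problem with the first option: a PromQL instant query against the Prometheus HTTP API normally returns a vector with one sample per label set, so one query about pods yields one value per pod. The sketch below parses such a response. The response format is the standard one from the Prometheus HTTP API, but the metric name, pod names and values in `EXAMPLE_RESPONSE` are made up, and `values_by_label` is a hypothetical helper.

```python
import json
from typing import Dict

# Made-up example of what GET /api/v1/query returns for an instant query.
EXAMPLE_RESPONSE = """
{"status": "success",
 "data": {"resultType": "vector",
          "result": [
            {"metric": {"pod": "web-1"}, "value": [1650000000, "91.5"]},
            {"metric": {"pod": "web-2"}, "value": [1650000000, "42.0"]}
          ]}}
"""


def values_by_label(response_text: str, label: str) -> Dict[str, float]:
    """Map the given label (e.g. 'pod') to the sample value of each series."""
    payload = json.loads(response_text)
    if payload.get("status") != "success":
        raise ValueError("Prometheus reported a failed query")
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError("expected an instant vector")
    values = {}
    for series in data["result"]:
        name = series["metric"].get(label)
        if name is None:
            continue  # series without the label cannot be attributed to an object
        _timestamp, raw_value = series["value"]  # the value is sent as a string
        values[name] = float(raw_value)
    return values


per_pod = values_by_label(EXAMPLE_RESPONSE, "pod")
```

With a one-value-per-query limit, each entry in `per_pod` would need its own hand-written query, which is what makes that option hard to maintain when pods come and go.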
So my questions are:
1 - Can all the alerts above be generated in Checkmk?
2 - Is it possible with the free version?
3 - Should this be done with KSA and DHC, or in another way?
Hello,
The next release, 2.1, has been announced with new Kubernetes support, but I’m sure it will be an Enterprise feature.
You can use the free version, which is limited to 25 hosts.
The new K8s monitoring is also available in the Raw Edition, but you will have to write a script yourself to replace the DHC. For example, it could run through the folder of piggyback files and then make REST API calls to add the hosts.
It is up to you which path you take.
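A rough Python sketch of such a script, to show the idea rather than a finished tool. It assumes that the piggyback folder contains one subdirectory per piggybacked host (typically `tmp/check_mk/piggyback` inside the site directory) and that hosts are created with a POST to the `host_config` collection of the REST API using an automation user. Please verify the path, the API URL and the request fields against the REST API documentation of your own site. The function names are hypothetical, and host attributes are deliberately left empty.

```python
import json
import os
import urllib.request
from typing import Iterable, List


def piggyback_hosts(piggyback_dir: str) -> List[str]:
    """Return the subdirectory names, assumed to be one per piggybacked host."""
    with os.scandir(piggyback_dir) as entries:
        return sorted(entry.name for entry in entries if entry.is_dir())


def missing_hosts(found: Iterable[str], existing: Iterable[str]) -> List[str]:
    """Keep only the piggybacked hosts that are not configured yet."""
    known = set(existing)
    return [name for name in found if name not in known]


def create_host_request(api_url: str, user: str, secret: str,
                        host_name: str, folder: str = "/") -> urllib.request.Request:
    """Build (but do not send) the request that would create one host.

    Assumption: api_url looks like http://<server>/<site>/check_mk/api/1.0.
    """
    body = json.dumps({"host_name": host_name, "folder": folder, "attributes": {}})
    return urllib.request.Request(
        url=f"{api_url}/domain-types/host_config/collections/all",
        data=body.encode("utf-8"),
        headers={
            "Authorization": f"Bearer {user} {secret}",
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )
```

Each request would then be sent with `urllib.request.urlopen(...)`, followed by a service discovery and an activation of the pending changes. The list of existing hosts can be fetched from the REST API as well. Removing hosts whose piggyback data has disappeared is left out here.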
@danielserrao
The alerts you mentioned will be possible with our new monitoring. The beta should start next month, so you can try it out then.
PV/PVCs support will not be immediately available, but hopefully included in another release later this year. In the meantime, we will probably provide a backport for users to add PV/PVC monitoring.
Hi @martin.hirschvogel, is there any news regarding PV/PVC support? We use Checkmk to monitor production Kubernetes clusters, so the volumes are also very important to us.
It is planned for development. Our current prioritization for services is:
Services
CronJobs
Persistent Volumes/PVC
We can shuffle this around if you help us with a good concept for PV/PVC (that is, talk with us a couple of times and show us how you use PV/PVCs in production).