Help us improve our Kubernetes monitoring

Hello Checkmk Community!
We want to advance our Kubernetes monitoring. For this, we are looking for Kubernetes cluster admins who can help us sharpen the requirements.

To get started, I am planning to do some phone/video interviews to better understand your requirements for deploying Checkmk and a potential monitoring agent inside Kubernetes. I also want to understand what information is crucial to you when monitoring Kubernetes.
Further down the line, I would like to involve users directly in our sprint reviews, where we showcase early iterations and you can give feedback directly.

Interested? Write me a PM or post in this thread.

Hello Martin,
are there plans to monitor add-ons such as Longhorn?
In my case, fully automated monitoring of running containers, providing most of the information you get when checking VMs or physical systems, would be great.
Other features:
checking images and layers for defined content
Ralf

Hi,

I actually have some problems using Checkmk for monitoring in our Kubernetes environment. Some parts are missing or even wrong. We can talk on the phone, if you like, about what I would like to see included in the Checkmk monitoring.

bye
David

@martin.hirschvogel What I would like to know is which of the Kubernetes monitoring ideas you received as user feedback could not be implemented because they were too complicated or too specialized for Checkmk, but still made sense.

Since the last Robocon, I have kept the Kubernetes Library for Robot Framework in mind. It was impressive to see in the live demo how deeply Robot Framework could test a Kubernetes cluster via the API. An integration with Robotmk could fill this gap.

If somebody here has ideas for an interesting use case to implement such a Kubernetes test (too sophisticated for cmk, but generic enough to be of common value) with Robot Framework and Robotmk, please contact me. :slight_smile:

Thanks to everyone who also contacted me directly. The discussions were very helpful.
I am currently writing the presentation for the Checkmk conference, where we will share our plans.

And we could probably hire a few more people and still have plenty of work ahead. As our developer put it today: if you read about it, it’s already deprecated.

As far as I can see, the KubeLibrary from Robot Framework works similarly to the current Checkmk monitoring. They both use the official Python client for Kubernetes, which has the big problem that it can’t keep up with the fast changes of the Kubernetes API itself…
And it can only use the data provided by the API, which is not enough for proper performance monitoring (which you can do in Checkmk if you use the Prometheus integration).

Basically, yes. However, the RF library allows you to write your own tests.

It was not meant as a substitute for Checkmk :slight_smile: I would never do performance/resource monitoring with the Kubernetes Library when Checkmk can do this, too. The cool thing in my eyes is that you can write tests with multiple steps or in combination with other RF libraries. There are some interesting use cases on the library’s project site which inspired me.

But, as I said, I do not have a clear picture of what K8s admins really need. It was just an idea.

As most applications running in K8s are web applications, RF is a great complement.
I can imagine that combining the web tests with infrastructure tests is helpful: if the web app doesn’t work, you could directly attribute the failure to its source (e.g. the deployment is not working properly vs. the ingress or service is not working properly), like in a BI tree.
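
To make the idea a bit more concrete, here is a minimal sketch in plain Python of how such an attribution could work. It is only an illustration under my own assumptions: the `Node` class, the `attribute` function and the state constants are hypothetical names, and nothing here is the actual Checkmk BI, Robotmk or KubeLibrary API. The infrastructure checks are aggregated worst-state-wins, and a failed web test is attributed to whichever leaves are not OK.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical state constants, loosely modelled on common monitoring states.
OK, WARN, CRIT = 0, 1, 2


@dataclass
class Node:
    """A node in a small aggregation tree (illustrative, not Checkmk BI)."""
    name: str
    state: Optional[int] = None  # only set on leaves
    children: List["Node"] = field(default_factory=list)

    def aggregate(self) -> int:
        """Return the worst state of all leaves below this node."""
        if not self.children:
            return self.state if self.state is not None else OK
        return max(child.aggregate() for child in self.children)

    def failing_leaves(self) -> List[str]:
        """Return the names of all leaves whose state is not OK."""
        if not self.children:
            return [self.name] if self.aggregate() != OK else []
        names: List[str] = []
        for child in self.children:
            names.extend(child.failing_leaves())
        return names


def attribute(web_test_ok: bool, infra: Node) -> str:
    """Attribute a failed web test to the failing infrastructure checks."""
    if web_test_ok:
        return "web test passed"
    culprits = infra.failing_leaves()
    if culprits:
        return "web test failed, likely cause: " + ", ".join(culprits)
    return "web test failed, infrastructure checks OK (suspect the application itself)"


# Example data, invented for illustration: only the ingress check is failing.
infra = Node("infrastructure", children=[
    Node("deployment", state=OK),
    Node("service", state=OK),
    Node("ingress", state=CRIT),
])

print(attribute(False, infra))
```

In a real setup, the leaf states would come from the infrastructure checks and the `web_test_ok` flag from the web test result; here they are hard-coded.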
