Hosts flapping since last week, down/up

**CMK version:** 2.0.0p22
OS version: Debian buster, running in Docker

Error message: [agent] Empty output from agent at 13.xx.xx.20:6556 CRIT, Got no information from host, execution time 0.0 sec

Output of “cmk --debug -vvn hostname” (if it is a problem with checks or plugins): Sorry, I don't know how to run this since it's in Docker.

Hello,

I installed Checkmk using Docker on our Debian server. I have installed the agent on 17 Linux servers, mostly Ubuntu plus a few Debian servers.

I set up my notifications to come through Pushover and to notify only on ANY>DOWN, that is, when any host goes down, since that is what I need to know to keep uptime high. I also set up Checkmk to use the status of the agent instead of a ping to determine whether a host is up, since my servers cannot be pinged.

Last Friday, I think, all hosts suddenly started to go down. Additional services were also somehow added and monitored on my Checkmk hosts, and they are showing yellow. Is there an auto-upgrade? I am not sure what happened, but ever since, hosts are reported down and up and down and up, flapping constantly. In reality the hosts are up; none of them ever went down, they just keep losing the connection to the agent for some reason. It was working extremely well, with all hosts green, but now it is unstable and there are always red hosts, although they are in fact fine (most of the time).

I also wanted to check a domain's SSL certificate, so I created another HTTP service for that. I am still struggling to set it up, but I believe it is not relevant here.

What could be the problem?

The Checkmk agent replies on port 6556; it is accessible, and the service is running and enabled.

Thanks!

I don’t know what is really happening on your system, but you can have a look at the resource usage of your container. As you are using the Raw Edition inside a container, you need a very high CPU allocation. Most problems I have seen with Checkmk inside containers were resource problems.
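One way to look at CPU pressure beyond average usage is the cgroup throttling counters, which can be non-zero even when overall usage looks low. The following is only a rough sketch: it assumes the container has a CPU limit set and that `cpu.stat` is readable at one of the two paths listed (both paths are assumptions about a typical cgroup v1 or v2 layout and may differ on your system). The function names are made up for illustration.

```python
from pathlib import Path
from typing import Dict

# Assumed locations of cpu.stat inside the container (cgroup v2, then v1).
CANDIDATE_PATHS = [
    Path("/sys/fs/cgroup/cpu.stat"),
    Path("/sys/fs/cgroup/cpu/cpu.stat"),
]


def parse_cpu_stat(text: str) -> Dict[str, int]:
    """Parse 'key value' lines from a cgroup cpu.stat file into a dict."""
    stats: Dict[str, int] = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def throttled_fraction(stats: Dict[str, int]) -> float:
    """Return the share of scheduler periods in which the cgroup was throttled."""
    periods = stats.get("nr_periods", 0)
    if periods == 0:
        # No CPU limit configured, or no periods recorded yet.
        return 0.0
    return stats.get("nr_throttled", 0) / periods


if __name__ == "__main__":
    for path in CANDIDATE_PATHS:
        if path.is_file():
            stats = parse_cpu_stat(path.read_text())
            print(f"{path}: throttled in {throttled_fraction(stats):.1%} of periods")
            break
    else:
        print("No cpu.stat found at the assumed paths")
```

If the throttled share grows between two runs taken a few minutes apart, the container is hitting its CPU limit during that interval, even if the average usage looks moderate.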

Thank you for your reply. I will monitor resources from now on, but from what I have found so far, resources are not being maxed out.

What could be the reason the check-mk plugin keeps failing and flapping? It detects the agents fine at one moment, and a second later half of them are down.

I have now changed the host check command to use a TCP connection instead of the agent status, and that works well, but the check-mk service still shows errors in red.
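A TCP host check passing while the check-mk service stays red is consistent with the port accepting connections but the agent sending nothing back, which is what "Empty output from agent" reports. A small sketch to tell the two cases apart from the monitoring server follows. It assumes an unencrypted, plain-text agent on port 6556 whose output starts with the `<<<check_mk>>>` section header; the function names and the `"ok"`/`"empty"`/`"unexpected"` labels are made up for illustration.

```python
import socket


def fetch_agent_output(host: str, port: int = 6556, timeout: float = 10.0) -> bytes:
    """Connect to the agent port and read until the agent closes the connection."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


def classify_agent_output(raw: bytes) -> str:
    """Classify a raw agent reply as 'empty', 'ok', or 'unexpected'."""
    if not raw.strip():
        # The connection was accepted but no data arrived.
        return "empty"
    if raw.lstrip().startswith(b"<<<check_mk>>>"):
        # Assumption: a plain-text agent starts with this section header.
        return "ok"
    return "unexpected"


if __name__ == "__main__":
    import sys

    # Usage: python probe_agent.py <host>
    target = sys.argv[1]
    reply = fetch_agent_output(target)
    print(f"{target}: {classify_agent_output(reply)} ({len(reply)} bytes)")
```

Running this several times in a row against a flapping host, from inside the Checkmk container, shows whether the empty replies are intermittent on the agent side or only seen by Checkmk itself.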
