Kubernetes / OKD restarting containers


How do you deal with containers as hosts in Checkmk when k8s / OKD restarts a pod, creating new container instances but not instantly deleting the old ones that are no longer running?

Checkmk’s DCD creates hosts for the new container instances but will not remove the containers that are no longer running, resulting in host DOWN alerts.

Do we need to adjust the aggressiveness of k8s’ “garbage collection” to remove old non-running containers earlier?

We have similar problems, but how should Checkmk decide that a deleted container is not a problem?
Would missing systems with a defined tag, or all systems in a separate site, be OK?

That’s exactly the issue. What is the reason for k8s to keep the old containers around?

I’m working with Rancher and don’t know k8s.
Perhaps a tag or marker that blocks deletion would help.
My strategy is that all hosts in a special site may be deleted if there is a problem.
Not the perfect way, but OK for us.

kubelet Garbage Collection
Isn’t “MinAge” a good option to configure how long a dead container still exists inside k8s?
And for retrieving log information from a dead container, it is not so bad to have it accessible for some time.
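For reference, “MinAge” maps to the kubelet’s container garbage collection settings. A minimal sketch of the legacy kubelet flags (these are deprecated in newer Kubernetes releases in favor of the kubelet configuration file; the values here are example choices, not recommendations):

```shell
# Legacy kubelet container GC flags (deprecated, sketch only):
kubelet \
  --minimum-container-ttl-duration=1m \        # MinAge: keep a dead container at least 1 minute
  --maximum-dead-containers-per-container=1 \  # MaxPerPodContainer: at most 1 dead instance per container
  --maximum-dead-containers=50                 # MaxContainers: node-wide cap on dead containers
```

Setting the TTL low (or the caps small) makes the kubelet clean up dead containers sooner, at the cost of losing them for post-mortem inspection.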

Why should there be log information inside a dead container?

Perhaps as an option to solve your problem ;-)
Container technology needs some more years to “grow up”, although containers are a really big thing.

Inside a raw k8s, the last information about a dead container is only inside the container itself, if I remember correctly. But if you don’t need this information for troubleshooting, then the “MinAge” setting with a value of 0 should remove every dead container.
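As long as the previous container instance still exists, its logs can be pulled via the API server instead of going into the container; a short sketch (pod and container names are placeholders):

```shell
# Logs of the currently running container in a pod
kubectl logs my-pod -c my-container

# Logs of the previous, terminated instance of that container,
# available only while the dead container has not been garbage-collected
kubectl logs my-pod -c my-container --previous
```

This is one reason a non-zero “MinAge” can be useful: with MinAge=0 the dead container (and with it the `--previous` logs) disappears immediately.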