CMK version: 2.2.0p16
OS version: kubernetes v1.27.12+rke2r1
Error message: It is not really an error but a phenomenon: Checkmk randomly loses the CPU and RAM usage values for pods.
Solution:
After days of searching for the problem, I found out that the --cache-maxsize value is set to 10000, which means that at most 10000 metrics are kept. That is not enough for large clusters like mine, so the value has to be raised. This was not possible via the Helm chart, so I added a chart option for it. See pull request: Add chart option for cluster-collector to set cache-maxsize if a k8s-cluster generate more than 10000 metrics #27
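To illustrate why a size limit like this can make values disappear seemingly at random, here is a minimal sketch of a size-bounded cache. It is not the cluster collector's actual code: `BoundedCache` and `simulate` are hypothetical names, and the oldest-first eviction policy is an assumption made only for this example. It just shows that once more metrics arrive than the cache can hold, some of them are silently dropped.

```python
from collections import OrderedDict


class BoundedCache:
    """Hypothetical cache that keeps at most `maxsize` entries.

    Assumption for illustration: the oldest entry is evicted first.
    """

    def __init__(self, maxsize):
        self.maxsize = maxsize
        self._data = OrderedDict()

    def put(self, key, value):
        # Re-inserting an existing key refreshes its position.
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = value
        # Evict the oldest entries until the cache fits its limit again.
        while len(self._data) > self.maxsize:
            self._data.popitem(last=False)

    def __contains__(self, key):
        return key in self._data

    def __len__(self):
        return len(self._data)


def simulate(total_metrics, maxsize):
    """Store `total_metrics` metrics and return (kept, lost) counts."""
    cache = BoundedCache(maxsize)
    for i in range(total_metrics):
        cache.put(f"metric-{i}", i)
    lost = sum(1 for i in range(total_metrics) if f"metric-{i}" not in cache)
    return len(cache), lost


# A cluster producing 12000 metrics against a limit of 10000:
kept, lost = simulate(total_metrics=12_000, maxsize=10_000)
print(kept, lost)  # 10000 2000
```

Which metrics end up missing depends on the order in which they arrive, which would match the impression that pod values vanish at random. With a limit above the number of metrics the cluster produces, nothing is evicted.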
Hey @schmidax! Do you want to post your solution as a dedicated comment and mark it as the solution? That way, it is easier for everyone to see it. Thanks!