Kube Agent with Helm - IPv6 Support missing and/or not functional?

CMK version: 2.2.0p12 / kube_agent main_2022.03.02
OS version: Debian 12

The Helm chart does not support IPv6. The agent only starts with a workaround, and even then we hit the error below. Can someone help me here? We are currently deploying an IPv6-only cluster and need this to work.

Error message:

CRITICAL:	 2023-09-27 15:04:01,574 - Failed to send container metrics to cluster collector: Internal Server Error
Traceback (most recent call last):
  File "/usr/local/bin/checkmk-container-metrics-collector", line 8, in <module>
    sys.exit(main_container_metrics())
  File "/usr/local/lib/python3.10/site-packages/checkmk_kube_agent/send_metrics.py", line 466, in _main
    worker(session, cluster_collector_base_url, headers, verify)
  File "/usr/local/lib/python3.10/site-packages/checkmk_kube_agent/send_metrics.py", line 355, in container_metrics_worker
    _verify_and_log_cluster_collector_response(
  File "/usr/local/lib/python3.10/site-packages/checkmk_kube_agent/send_metrics.py", line 416, in _verify_and_log_cluster_collector_response
    cluster_collector_response.raise_for_status()
  File "/usr/local/lib/python3.10/site-packages/requests/models.py", line 960, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 500 Server Error: Internal Server Error for url: http://checkmk-cluster-collector.monitoring:8080/update_container_metrics
    solved_result = await solve_dependencies(
  File "/usr/local/lib/python3.10/site-packages/fastapi/dependencies/utils.py", line 527, in solve_dependencies
    solved = await call(**sub_values)
  File "/usr/local/lib/python3.10/site-packages/checkmk_kube_agent/api.py", line 158, in authenticate_post
    return authenticate(
  File "/usr/local/lib/python3.10/site-packages/checkmk_kube_agent/api.py", line 95, in authenticate
    token_review_response = session.post(
  File "/usr/local/lib/python3.10/site-packages/requests/sessions.py", line 577, in post
    return self.request('POST', url, data=data, json=json, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/requests/sessions.py", line 515, in request
    prep = self.prepare_request(req)
  File "/usr/local/lib/python3.10/site-packages/requests/sessions.py", line 443, in prepare_request
    p.prepare(
  File "/usr/local/lib/python3.10/site-packages/requests/models.py", line 318, in prepare
    self.prepare_url(url, params)
  File "/usr/local/lib/python3.10/site-packages/requests/models.py", line 386, in prepare_url
    raise InvalidURL(*e.args)
requests.exceptions.InvalidURL: Failed to parse: https://2a01:xxx:xx:448e:ffff::1:443/apis/authentication.k8s.io/v1/tokenreviews
2a01:xxx:xx:4482:ffff::aa28:43728 - "GET /health HTTP/1.1" 200
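
The `InvalidURL` at the end of the traceback is consistent with an IPv6 literal being placed into the URL without square brackets: RFC 3986 requires `https://[addr]:port/...`, otherwise the parser cannot tell the address colons from the port separator. As a minimal sketch of that bracketing step (the helper names `format_host_for_url` and `build_api_url` are hypothetical and not taken from checkmk_kube_agent, and the address is a documentation example):

```python
import ipaddress


def format_host_for_url(host: str) -> str:
    """Wrap IPv6 literals in square brackets; leave everything else unchanged."""
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        # Not a bare IP literal (e.g. a DNS name or an already bracketed host).
        return host
    return f"[{host}]" if address.version == 6 else host


def build_api_url(host: str, port: str, path: str) -> str:
    """Assemble an HTTPS URL that stays parseable for IPv6 hosts."""
    return f"https://{format_host_for_url(host)}:{port}{path}"


# Hypothetical values; in a pod they would typically come from the environment.
print(build_api_url("2001:db8::1", "443", "/apis/authentication.k8s.io/v1/tokenreviews"))
```

IPv4 addresses and host names pass through unchanged, so the same helper would work on IPv4, IPv6 and dual-stack clusters.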

Are you sure about the version "kube_agent main_2022.03.02"? How did you deploy the Helm chart?

We are using the kube-agent with this Helm chart: https://github.com/Checkmk/checkmk_kube_agent/tree/main/deploy/charts/checkmk

with the following settings:

    tlsCommunication:
      enabled: false
      verifySsl: false

    rbac:
      ## PodSecurityPolicy was deprecated in Kubernetes v1.21, and removed from Kubernetes in v1.25.
      pspEnabled: false

    serviceAccount:
      ## Specifies whether a service account should be created
      create: true

    networkPolicy:
      enabled: false

    ## Configuration for cluster-collector
    clusterCollector:
      # can be: "debug", "info", "warning" (default), "critical"
      logLevel: warning

      ipv6: true

      service:
        # if required specify "NodePort" here to expose the cluster-collector via the "nodePort" specified below
        type: NodePort
        port: 8080
        nodePort: 30035

      ingress:
        enabled: false

    ## Configuration for node-collector components (cadvisor, container-metrics, machine-sections)
    nodeCollector:
      # logLevel for container-metrics and machine-sections; can be: "debug", "info", "warning" (default), "critical"
      logLevel: warning

The ipv6 option is a workaround we added because, with the default Helm chart, the container starts listening on "0.0.0.0". And yes, the default container version is "main_2022.03.02": https://github.com/Checkmk/checkmk_kube_agent/blob/2befd1e911e54ea8c5e50754d423bb32a43ed197/deploy/charts/checkmk/values.yaml#L33
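
For context on why the bind address matters: "0.0.0.0" is the IPv4 wildcard, so a server bound to it does not accept connections on an IPv6 pod address, while "::" is the IPv6 wildcard. A rough sketch of picking the wildcard by address family follows; the function name `wildcard_listen_address` is hypothetical and this is not how the chart or the agent implements it:

```python
import ipaddress


def wildcard_listen_address(pod_ip: str) -> str:
    """Return the wildcard bind address matching the pod IP's address family."""
    return "::" if ipaddress.ip_address(pod_ip).version == 6 else "0.0.0.0"


# Documentation-range example addresses, not taken from the cluster above.
print(wildcard_listen_address("2001:db8::10"))
print(wildcard_listen_address("192.0.2.10"))
```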

With the latest version (main_2023.10.26) and the Helm chart, the checkmk-cluster-collector pod produces no logs and the readiness/liveness probes fail:

Liveness probe failed: Get "http://[2a01:xxx:xx:4482:ffff::3703]:10050/health": dial tcp [2a01:xxx:xx:4482:ffff::3703]:10050: connect: connection refused

Readiness probe failed: Get "http://[2a01:xxx:xx:4482:ffff::3703]:10050/health": dial tcp [2a01:xxx:xx:4482:ffff::3703]:10050: connect: connection refused

IPv6 support has arrived: checkmk_kube_agent/.werks/16419 at main · Checkmk/checkmk_kube_agent · GitHub

And you can also configure the listeners now: checkmk_kube_agent/deploy/charts/checkmk/values.yaml at ea870ce694cbb43025a91ae1d5e14c11bd82574e · Checkmk/checkmk_kube_agent · GitHub

:+1: thank you, we will test it.