Getting error: no unmonitored services found, no vanished services found [special_kubernetes] Agent exited with code 1: (404)

Also, the IP address of the API server differs depending on whether you reach it from inside or outside the cluster.
As you have deployed Checkmk within Kubernetes, you also need to use the ClusterIP of the kubernetes service, e.g.:
mh@klapp-0141:~$ kubectl get svc kubernetes
NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
kubernetes   ClusterIP   10.100.0.1   <none>        443/TCP   105d
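
From inside a pod, the same ClusterIP and port can usually be read without kubectl. The following is a minimal Python sketch, assuming the standard KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT environment variables that Kubernetes injects into pods; the function name in_cluster_api_endpoint is hypothetical and not part of Checkmk.

```python
import os
from typing import Mapping, Optional


def in_cluster_api_endpoint(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Build the in-cluster API server URL from the pod environment.

    Returns None when the variables are missing, which typically means
    the process is not running inside a Kubernetes pod.
    """
    env = os.environ if env is None else env
    host = env.get("KUBERNETES_SERVICE_HOST")
    port = env.get("KUBERNETES_SERVICE_PORT")
    if not host or not port:
        return None
    if ":" in host:
        # IPv6 literals must be bracketed inside a URL.
        host = f"[{host}]"
    return f"https://{host}:{port}"


if __name__ == "__main__":
    print(in_cluster_api_endpoint() or "not running inside a pod")
```

Run inside the Checkmk container, this should print the value to enter as the custom URL in the Kubernetes rule.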

Hi,

I have now selected one of the hosts, “master”, under the Conditions tab → Explicit hosts, which I was missing earlier, as identified from your screenshot. I also changed the API server endpoint to the IP address and port 6443, as my API server endpoint is exposed on this port.

I activated the changes.
Services are still not showing for the master host, please guide further.

I have now tried to run the debug output.
Below is the output:

OMD[checkmk]:~$ cmk -D master

master                                                                         
Addresses:              10.160.0.12
Tags:                   [address_family:ip-v4-only], [agent:all-agents], [criticality:prod], [ip-v4:ip-v4], [networking:lan], [piggyback:auto-piggyback], [site:checkmk], [snmp_ds:no-snmp], [tcp:tcp]
Labels:                 
Host groups:            check_mk
Contact groups:         all
Agent mode:             Normal Checkmk agent, all configured special agents
Type of agent:          
  TCP: 10.160.0.12:6556
  Program: /omd/sites/checkmk/share/check_mk/agents/special/agent_kubernetes --pwstore=2@0@kubernetes '--token' '************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************' '--infos' 'nodes,services,ingresses,deployments,pods,endpoints,daemon_sets,stateful_sets,jobs' '--no-cert-check' '--api-server-endpoint' 'https://10.160.0.12' '--port' '6443'
  Process piggyback data from /omd/sites/checkmk/tmp/check_mk/piggyback/master
Services:
  checktype item params description groups
  --------- ---- ------ ----------- ------
OMD[checkmk]:~$

Thanks
Nitin Goyal

OK, let me try to use the cluster IP of the kubernetes service instead of the API server IP. Thanks.

Please share output of:

/omd/sites/checkmk/share/check_mk/agents/special/agent_kubernetes --pwstore=2@0@kubernetes '--token' '************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************' '--infos' 'nodes,services,ingresses,deployments,pods,endpoints,daemon_sets,stateful_sets,jobs' '--no-cert-check' '--api-server-endpoint' 'https://10.160.0.12' '--port' '6443' --debug --verbose

I am getting the output below for the above command:

OMD[checkmk]:/$ /omd/sites/checkmk/share/check_mk/agents/special/agent_kubernetes --pwstore=2@0@kubernetes '--token' '************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************' '--infos' 'nodes,services,ingresses,deployments,pods,endpoints,daemon_sets,stateful_sets,jobs' '--no-cert-check' '--api-server-endpoint' 'https://10.160.0.12' '--port' '6443' --debug --verbose
Traceback (most recent call last):
  File "/omd/sites/checkmk/share/check_mk/agents/special/agent_kubernetes", line 12, in <module>
    sys.exit(main())
  File "/omd/sites/checkmk/lib/python3/cmk/special_agents/agent_kubernetes.py", line 1415, in main
    api_data = ApiData(
  File "/omd/sites/checkmk/lib/python3/cmk/special_agents/agent_kubernetes.py", line 1195, in __init__
    ingresses: Iterator = ext_api.list_ingress_for_all_namespaces().items
  File "/omd/sites/checkmk/lib/python3/kubernetes/client/apis/extensions_v1beta1_api.py", line 2714, in list_ingress_for_all_namespaces
    (data) = self.list_ingress_for_all_namespaces_with_http_info(**kwargs)
  File "/omd/sites/checkmk/lib/python3/kubernetes/client/apis/extensions_v1beta1_api.py", line 2795, in list_ingress_for_all_namespaces_with_http_info
    return self.api_client.call_api('/apis/extensions/v1beta1/ingresses', 'GET',
  File "/omd/sites/checkmk/lib/python3/kubernetes/client/api_client.py", line 330, in call_api
    return self.__call_api(resource_path, method,
  File "/omd/sites/checkmk/lib/python3/kubernetes/client/api_client.py", line 163, in __call_api
    response_data = self.request(method, url,
  File "/omd/sites/checkmk/lib/python3/kubernetes/client/api_client.py", line 351, in request
    return self.rest_client.GET(url,
  File "/omd/sites/checkmk/lib/python3/kubernetes/client/rest.py", line 227, in GET
    return self.request("GET", url,
  File "/omd/sites/checkmk/lib/python3/kubernetes/client/rest.py", line 222, in request
    raise ApiException(http_resp=r)
kubernetes.client.rest.ApiException: (404)
Reason: Not Found
HTTP response headers: HTTPHeaderDict({'Audit-Id': '9cc4fadf-ded0-436d-a1fb-178447c9163a', 'Cache-Control': 'no-cache, private', 'Content-Type': 'text/plain; charset=utf-8', 'X-Content-Type-Options': 'nosniff', 'X-Kubernetes-Pf-Flowschema-Uid': 'a52fff19-1f48-48a4-98da-9bb46153794b', 'X-Kubernetes-Pf-Prioritylevel-Uid': '83efeff5-87fa-4139-bf31-0b607acadea2', 'Date': 'Wed, 25 May 2022 10:46:23 GMT', 'Content-Length': '19'})
HTTP response body: 404 page not found

nigoyal7@master:~$ kubectl get svc kubernetes
NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   3d20h
nigoyal7@master:~$

I tried the cluster IP and port of the kubernetes service by entering the custom URL https://10.96.0.1:443 in Checkmk under the Kubernetes rules, and then checked the output of the above command. I am getting the same output in both cases. Please suggest.

Thanks
Nitin Goyal

@nigoyal7

nigoyal7@master:~$ kubectl get svc kubernetes
NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   3d20h
nigoyal7@master:~$

The cluster IP is fetched from the “kubectl config view” command and not from “kubectl get svc kubernetes”.

@chauhan_sudhir - I followed Martin's suggestion above that the API server address changes when moving outside the cluster, so I tried the cluster IP of the kubernetes service. kubectl config view again shows the same API server IP, 10.160.0.12, and port 6443, which I had already tried. In both cases I get the same output for the command you gave.

Thanks
Nitin Goyal

What is the output of “kubectl config view”?

Chapter 2.1 of “Monitoring Kubernetes” describes deploying the RBAC YAML file.
Did you already do that?

Below is the output of the kubectl config view command:

nigoyal7@master:~$ kubectl config view
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: DATA+OMITTED
    server: https://10.160.0.12:6443
  name: kubernetes
contexts:
- context:
    cluster: kubernetes
    user: kubernetes-admin
  name: kubernetes-admin@kubernetes
current-context: kubernetes-admin@kubernetes
kind: Config
preferences: {}
users:
- name: kubernetes-admin
  user:
    client-certificate-data: REDACTED
    client-key-data: REDACTED
nigoyal7@master:~$

Yes, I have deployed the RBAC YAML and done the configuration as described in section 2.1.

It depends on where the Checkmk server is located.

Nitin, can you please confirm that you set up Checkmk inside a Kubernetes cluster and that you are trying to monitor that same cluster? Every other step depends on that.

If Checkmk is inside the Kubernetes cluster, then Checkmk has to contact the Kubernetes API server via the internal Kubernetes network, and you have to ensure that this internal K8s service is accessible to Checkmk.
If Checkmk is outside the Kubernetes cluster, follow the steps in the documentation as pointed out by Sudhir.

I can only recommend running Checkmk in a Kubernetes cluster if you are an absolute Kubernetes expert with deep expertise in Kubernetes networking. If you are not too familiar with Kubernetes, please run Checkmk on a dedicated VM.

Try specifying the token explicitly instead of taking it from the password store, and see whether this command produces any output. With the token, it should look like this:

/omd/sites/v200p24/share/check_mk/agents/special/agent_kubernetes '--token' 'blablablabla..bla' '--infos' 'nodes,services,ingresses,deployments,pods,endpoints,daemon_sets,stateful_sets,jobs' '--no-cert-check' '--api-server-endpoint' 'https://1.2.3.4' '--port' '443'
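
Before pasting a token on the command line, it can help to confirm which service account it belongs to. The following is a minimal Python sketch, assuming the token is a JWT with three dot-separated parts, as service account tokens usually are; the function name jwt_claims is hypothetical, and the signature is deliberately not verified, so this is for inspection only.

```python
import base64
import json


def jwt_claims(token: str) -> dict:
    """Decode the payload (claims) of a JWT without verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a JWT with header.payload.signature")
    payload = parts[1]
    # JWTs strip base64 padding, so restore it before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


if __name__ == "__main__":
    import sys

    # Usage: python3 jwt_claims.py < token.txt
    for key, value in jwt_claims(sys.stdin.read().strip()).items():
        print(f"{key}: {value}")
```

For a service account token, the "sub" claim typically has the form system:serviceaccount:<namespace>:<name>, which should match the account created by the RBAC YAML.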

Hi Martin,

I am deploying Checkmk in the Kubernetes cluster itself by following the guide below:

I am then configuring the Kubernetes monitoring by following this guide:

Thanks
Nitin Goyal

Hi Sudhir,

Please find below the output of the mentioned command:

OMD[checkmk]:/omd/sites$ /omd/sites/checkmk/share/check_mk/agents/special/agent_kubernetes '--token' 'eyJhbGciOiJSUzI1NiIsImtpZCI6InJwb3R2MldBdV81a1VVTWRpb3dpOV85dG5rdW1XZ3doN1RxSjA5Zy15cUEifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJjaGVja21rIiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZWNyZXQubmFtZSI6ImNoZWNrbWstdG9rZW4tNTZ2cDUiLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC5uYW1lIjoiY2hlY2ttayIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50LnVpZCI6ImZjNmVkYjE2LTM5NDUtNDU2Ny1hNTI1LThlMzBmYzQwZWQ4OSIsInN1YiI6InN5c3RlbTpzZXJ2aWNlYWNjb3VudDpjaGVja21rOmNoZWNrbWsifQ.5N5rgYhUrLtH9FxXmPik6HGqKwtTDj-IpjiG7yvbHkRwLY0zqy75qKsPvw5EbpDIJ8RX92HmkRKYQaUIRzkkYJbat4RLtvDXakVxNypTuM2PC9JXEltlumUPtH9R38NU4BnbbGHbGpb8wP4u00XEcAcEblonXTSaDfOjj7CwgUmGIdYP-8BrCxMEvi1J2orGIGDZ-zv29OJy2ESRpEfIAfox8Dd61_r0PeIjZsF5D91NJaFzhCi06SX1hdUVH6cjvC1oRNccrTxFAPSE_y_1H2s21n3iC_8T0RnrECqH1tXjdkClvqdCcVAcqpB7MzzCToxz_RktsJn_8Eohtc2TJw' '--infos' 'nodes,services,ingresses,deployments,pods,endpoints,daemon_sets,stateful_sets,jobs' '--no-cert-check' '--api-server-endpoint' 'https://10.160.0.12' '--port' '6443'
(404)
Reason: Not Found
HTTP response headers: HTTPHeaderDict({'Audit-Id': '090663d2-fbe8-45a7-9dfc-605ca0205e20', 'Cache-Control': 'no-cache, private', 'Content-Type': 'text/plain; charset=utf-8', 'X-Content-Type-Options': 'nosniff', 'X-Kubernetes-Pf-Flowschema-Uid': 'a52fff19-1f48-48a4-98da-9bb46153794b', 'X-Kubernetes-Pf-Prioritylevel-Uid': '83efeff5-87fa-4139-bf31-0b607acadea2', 'Date': 'Wed, 25 May 2022 12:53:47 GMT', 'Content-Length': '19'})
HTTP response body: 404 page not found

OMD[checkmk]:/omd/sites$

Thanks
Nitin Goyal

Can you reduce the --infos list to just 'nodes,pods', add --debug --verbose to the end of the command, and share the output? It should be:

OMD[checkmk]:/omd/sites$ /omd/sites/checkmk/share/check_mk/agents/special/agent_kubernetes '--token' 'eyJhbGciOiJSUzI1NiIsImtpZCI6InJwb3R2MldBdV81a1VVTWRpb3dpOV85dG5rdW1XZ3doN1RxSjA5Zy15cUEifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJjaGVja21rIiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZWNyZXQubmFtZSI6ImNoZWNrbWstdG9rZW4tNTZ2cDUiLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC5uYW1lIjoiY2hlY2ttayIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50LnVpZCI6ImZjNmVkYjE2LTM5NDUtNDU2Ny1hNTI1LThlMzBmYzQwZWQ4OSIsInN1YiI6InN5c3RlbTpzZXJ2aWNlYWNjb3VudDpjaGVja21rOmNoZWNrbWsifQ.5N5rgYhUrLtH9FxXmPik6HGqKwtTDj-IpjiG7yvbHkRwLY0zqy75qKsPvw5EbpDIJ8RX92HmkRKYQaUIRzkkYJbat4RLtvDXakVxNypTuM2PC9JXEltlumUPtH9R38NU4BnbbGHbGpb8wP4u00XEcAcEblonXTSaDfOjj7CwgUmGIdYP-8BrCxMEvi1J2orGIGDZ-zv29OJy2ESRpEfIAfox8Dd61_r0PeIjZsF5D91NJaFzhCi06SX1hdUVH6cjvC1oRNccrTxFAPSE_y_1H2s21n3iC_8T0RnrECqH1tXjdkClvqdCcVAcqpB7MzzCToxz_RktsJn_8Eohtc2TJw' '--infos' 'nodes,pods' '--no-cert-check' '--api-server-endpoint' 'https://10.160.0.12' '--port' '6443' --debug --verbose

Hi Sudhir,

I reduced the options in the Kubernetes rules and activated the changes, then ran the command as above. Below is the output, please check:

OMD[checkmk]:~$ /omd/sites/checkmk/share/check_mk/agents/special/agent_kubernetes '--token' 'eyJhbGciOiJSUzI1NiIsImtpZCI6InJwb3R2MldBdV81a1VVTWRpb3dpOV85dG5rdW1XZ3doN1RxSjA5Zy15cUEifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJjaGVja21rIiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZWNyZXQubmFtZSI6ImNoZWNrbWstdG9rZW4tNTZ2cDUiLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC5uYW1lIjoiY2hlY2ttayIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50LnVpZCI6ImZjNmVkYjE2LTM5NDUtNDU2Ny1hNTI1LThlMzBmYzQwZWQ4OSIsInN1YiI6InN5c3RlbTpzZXJ2aWNlYWNjb3VudDpjaGVja21rOmNoZWNrbWsifQ.5N5rgYhUrLtH9FxXmPik6HGqKwtTDj-IpjiG7yvbHkRwLY0zqy75qKsPvw5EbpDIJ8RX92HmkRKYQaUIRzkkYJbat4RLtvDXakVxNypTuM2PC9JXEltlumUPtH9R38NU4BnbbGHbGpb8wP4u00XEcAcEblonXTSaDfOjj7CwgUmGIdYP-8BrCxMEvi1J2orGIGDZ-zv29OJy2ESRpEfIAfox8Dd61_r0PeIjZsF5D91NJaFzhCi06SX1hdUVH6cjvC1oRNccrTxFAPSE_y_1H2s21n3iC_8T0RnrECqH1tXjdkClvqdCcVAcqpB7MzzCToxz_RktsJn_8Eohtc2TJw' '--infos' 'nodes,pods' '--no-cert-check' '--api-server-endpoint' 'https://10.160.0.12' '--port' '6443' --debug --verbose
Traceback (most recent call last):
  File "/omd/sites/checkmk/share/check_mk/agents/special/agent_kubernetes", line 12, in <module>
    sys.exit(main())
  File "/omd/sites/checkmk/lib/python3/cmk/special_agents/agent_kubernetes.py", line 1415, in main
    api_data = ApiData(
  File "/omd/sites/checkmk/lib/python3/cmk/special_agents/agent_kubernetes.py", line 1195, in __init__
    ingresses: Iterator = ext_api.list_ingress_for_all_namespaces().items
  File "/omd/sites/checkmk/lib/python3/kubernetes/client/apis/extensions_v1beta1_api.py", line 2714, in list_ingress_for_all_namespaces
    (data) = self.list_ingress_for_all_namespaces_with_http_info(**kwargs)
  File "/omd/sites/checkmk/lib/python3/kubernetes/client/apis/extensions_v1beta1_api.py", line 2795, in list_ingress_for_all_namespaces_with_http_info
    return self.api_client.call_api('/apis/extensions/v1beta1/ingresses', 'GET',
  File "/omd/sites/checkmk/lib/python3/kubernetes/client/api_client.py", line 330, in call_api
    return self.__call_api(resource_path, method,
  File "/omd/sites/checkmk/lib/python3/kubernetes/client/api_client.py", line 163, in __call_api
    response_data = self.request(method, url,
  File "/omd/sites/checkmk/lib/python3/kubernetes/client/api_client.py", line 351, in request
    return self.rest_client.GET(url,
  File "/omd/sites/checkmk/lib/python3/kubernetes/client/rest.py", line 227, in GET
    return self.request("GET", url,
  File "/omd/sites/checkmk/lib/python3/kubernetes/client/rest.py", line 222, in request
    raise ApiException(http_resp=r)
kubernetes.client.rest.ApiException: (404)
Reason: Not Found
HTTP response headers: HTTPHeaderDict({'Audit-Id': 'abe38d9c-42b6-4eb5-9623-dc15153fb9ed', 'Cache-Control': 'no-cache, private', 'Content-Type': 'text/plain; charset=utf-8', 'X-Content-Type-Options': 'nosniff', 'X-Kubernetes-Pf-Flowschema-Uid': 'a52fff19-1f48-48a4-98da-9bb46153794b', 'X-Kubernetes-Pf-Prioritylevel-Uid': '83efeff5-87fa-4139-bf31-0b607acadea2', 'Date': 'Thu, 26 May 2022 09:12:16 GMT', 'Content-Length': '19'})
HTTP response body: 404 page not found


OMD[checkmk]:~$

Thanks & Regards,
Nitin Goyal
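
Note that the traceback above still fails in list_ingress_for_all_namespaces even though only 'nodes,pods' was requested. To pull the failing request out of such a traceback quickly, here is a minimal Python sketch; the regular expression assumes the call_api('<path>', '<METHOD>' line format shown in this thread, and the function name failing_request is hypothetical.

```python
import re
from typing import Optional, Tuple

# Matches lines such as:
#   return self.api_client.call_api('/apis/extensions/v1beta1/ingresses', 'GET',
CALL_API_RE = re.compile(r"call_api\('([^']+)',\s*'([A-Z]+)'")


def failing_request(traceback_text: str) -> Optional[Tuple[str, str]]:
    """Return (method, path) of the first call_api line in a traceback, if any."""
    match = CALL_API_RE.search(traceback_text)
    if match is None:
        return None
    return match.group(2), match.group(1)


if __name__ == "__main__":
    import sys

    # Usage: agent_kubernetes ... --debug --verbose 2>&1 | python3 failing_request.py
    print(failing_request(sys.stdin.read()))
```

Applied to the output above, this would point at the Ingress request under /apis/extensions/v1beta1 as the call that returns 404, rather than the nodes or pods queries.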

What is the version of your Kubernetes?
Is it on-premises or managed? Please elaborate here.
What is the container runtime?

Hi Sudhir,

  1. Kubernetes version: 1.23.1
  2. The Kubernetes cluster is deployed on Google Compute Engine using kubeadm.
  3. Docker is installed as the container runtime.

Thanks
Nitin Goyal
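
With the version known, the 404 can be cross-checked against the request path in the traceback. As far as upstream Kubernetes goes, Ingress stopped being served under extensions/v1beta1 in v1.22, so a 1.23.1 API server would be expected to answer 404 for /apis/extensions/v1beta1/ingresses whichever endpoint address is used. The following is a minimal Python sketch of that check; the function names are hypothetical, and the removal version comes from the upstream deprecation guide, not from this thread.

```python
import re
from typing import Tuple

# Path requested by the old special agent, taken from the traceback in this thread.
LEGACY_INGRESS_PATH = "/apis/extensions/v1beta1/ingresses"
# Assumption: first upstream release that no longer serves Ingress under extensions/v1beta1.
REMOVED_IN = (1, 22)


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from strings such as '1.23.1' or 'v1.21.5'."""
    match = re.match(r"v?(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognised Kubernetes version: {version!r}")
    return int(match.group(1)), int(match.group(2))


def serves_legacy_ingress(version: str) -> bool:
    """True if this Kubernetes version is expected to serve LEGACY_INGRESS_PATH."""
    return parse_major_minor(version) < REMOVED_IN


if __name__ == "__main__":
    for version in ("1.21.5", "1.23.1"):
        status = "served" if serves_legacy_ingress(version) else "expected 404"
        print(f"{version}: {LEGACY_INGRESS_PATH} -> {status}")
```

If this reasoning holds, changing the endpoint between 10.160.0.12:6443 and 10.96.0.1:443 would not change the result, which matches what was observed.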

Can you share the output of this command? Please execute it from the monitoring server.

telnet 10.160.0.12 6443

Hi Sudhir

I am getting the output below for the above command:

nigoyal7@worker-2:~$ telnet 10.160.0.12 6443
Trying 10.160.0.12...
Connected to 10.160.0.12.
Escape character is '^]'.

Thanks
Nitin Goyal
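
Where telnet is not installed (for example inside a slim container image), the same TCP reachability test can be approximated in Python. This is a minimal sketch using only the standard library; the function name can_connect is hypothetical.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False


if __name__ == "__main__":
    # Address and port taken from this thread; adjust for your cluster.
    print(can_connect("10.160.0.12", 6443))
```

A successful connection only shows that the port is reachable. The 404 earlier in the thread is an HTTP response from the API server itself, so it indicates the server was already being reached when the agent ran.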

Thanks for the information. Is it possible for you to try the new Kubernetes monitoring, which is only available in 2.1.0?
In the meantime, I am investigating the problem with the old Kubernetes special agent and will get back to you.