Hello All,
I am new to check_mk.
I am setting up a small site (only a few servers) and have successfully set up one server (rapps) with local checks running.
I'd also like to monitor the server that runs the checkmk server software (ramon). So I installed check_mk_agent on that same server and expected to run local checks in the same manner as on rapps. I placed a simple script into /usr/lib/check_mk_agent/local/, but it is not executed by the agent/server. I was able to add ramon as a host and see some services that the check_mk server picked up. They are shown and categorized differently from those on rapps, but most importantly the local checks are not picked up.
The script I placed in /usr/lib/check_mk_agent/local/ has the correct executable permissions for user root, and when I run it by hand it returns the expected results.
The local directory that check_mk_agent reports matches this location.
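For reference, here is a minimal sketch of what a local check script can look like. The check itself (whether /tmp is writable) and the service name "Tmp writable" are made up for illustration and are not the script from this thread. The output line follows the usual local check layout: state, service name, metrics (or "-"), then free text.

```shell
#!/bin/sh
# Hypothetical local check: reports whether /tmp is writable.
# Output layout: <state> "<service name>" <metrics or -> <detail text>
# States: 0 = OK, 1 = WARN, 2 = CRIT, 3 = UNKNOWN
local_check() {
    if [ -w /tmp ]; then
        echo '0 "Tmp writable" - /tmp is writable'
    else
        echo '2 "Tmp writable" - /tmp is not writable'
    fi
}

local_check
```

Saved as an executable file in the agent's local directory, a script like this should produce exactly one line per service each time the agent runs.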
Is there a way to turn on some debugging to see what the agent is doing?
Perhaps the fact that this is a dual setup server/agent causes something to not work correctly?
Also, the Check_MK Agent service shows a warning and reports that there are no local checks on ramon:
Check_MK Agent Version: 2.1.0p27, OS: linux, TLS is not activated on monitored host (see details)**WARN**, Agent plugins: 0, Local checks: 0
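One way to see whether the agent itself emits any local check results is to run it by hand as root and look only at its local section. The sketch below assumes the agent output uses section headers of the form `<<<name>>>` and that the local section header starts with `<<<local`; the helper name extract_local_section is made up for this example.

```shell
#!/bin/sh
# Print only the local section of agent output read from stdin:
# everything from the "<<<local..." header up to the next "<<<" header.
extract_local_section() {
    awk '
        /^<<<local/ { in_local = 1; print; next }
        /^<<</      { in_local = 0 }
        in_local    { print }
    '
}

# Only run the agent if it is installed on this machine.
if command -v check_mk_agent >/dev/null 2>&1; then
    check_mk_agent | extract_local_section
fi
```

If this prints nothing, or only the header line, the agent is not finding or not executing the scripts in its local directory.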
root@rapps:/etc/check_mk# cmk-agent-ctl status
Version: 2.1.0p27
Agent socket: operational
IP allowlist: any
Connection: ramon:8000/mon
UUID: 267e1bfb-7218-433f-a9be-9e9f452d6231
Local:
Connection type: pull-agent
Certificate issuer: Site 'mon' local CA
Certificate validity: Fri, 28 Jul 2023 23:25:26 +0000 - Wed, 28 Nov 3021 23:25:26 +0000
Remote:
Connection type: pull-agent
Registration state: operational
Host name: rapps
Also, I am getting "Empty payload from controller at 192.168.1.20:6556" when I test the connection to host ramon, which is probably why it never gets to the local checks.
In my attempts to understand and fix the problem, I am now running into a communication issue between the server and the agent on ramon:
No services found. If you expect this host to have (vanished) services, it probably means that one of the confured data sources is not operating as expected. Take a look at the *Check_MK* service to see what is wrong.
Thank you for your reply, Simon. I tried this and other variations of host name and IP address for the local host. When ramon.altre.local is used as the host name (or 127.0.0.1 as the IP address), the agent test still fails with an empty payload, but all other tests succeed.
When I use IP address 192.168.1.20, the ping test succeeds, but all others fail. The working host used "--server ramon:8000" in its registration command and is working fine. My current registration for ramon also uses --server ramon:8000. With the tests using IP address 192.168.1.20, check_mk_agent still reports an empty payload:
Empty payload from controller at 192.168.1.20:6556
It seems like I have some kind of disconnect between the cmk-agent-ctl process and check_mk_agent.
Each of them seems to function OK on its own, as far as I can tell from its behavior, i.e.
cmk-agent-ctl works, and a socket is created and listening on port 6556 (this happens after registration occurs). Once the registration is deleted or the TLS registration is removed from the UI, port 6556 is freed again.
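To confirm whether anything is listening on the agent port from the error message (6556), the output of `ss -tln` can be filtered. This is a small sketch; the helper name port_is_listening is invented here, and it assumes the usual `ss -tln` column layout with the local address in the fourth column.

```shell
#!/bin/sh
# Succeeds if the `ss -tln` output on stdin shows a listener on TCP port $1.
port_is_listening() {
    awk -v port="$1" '
        $1 == "LISTEN" && $4 ~ (":" port "$") { found = 1 }
        END { exit found ? 0 : 1 }
    '
}

# Only query the system if ss is available.
if command -v ss >/dev/null 2>&1; then
    if ss -tln | port_is_listening 6556; then
        echo "port 6556: listening"
    else
        echo "port 6556: not listening"
    fi
fi
```

A listening port only shows that the controller is up. As far as I know, `cmk-agent-ctl dump` prints what the controller can actually collect from the agent, so that is worth trying next if the port looks fine but the payload is empty.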
OK, I figured this out. Here's the report for the record.
My first problem was due to a failure to register the host. The host name I ended up with is similar to the one for host rapps, which has been working fine:
The second problem was my misunderstanding of how systemd, as opposed to inetd, starts the agent.
I manually killed an agent process, and that led me down the wrong path. I later restored the agent by starting the systemd service again:
systemctl start check-mk-agent-async.service
and later turned on debug logging in the agent by adding the -d flag in the check-mk-agent-async.service file:
ExecStart=/usr/bin/check_mk_agent -d
and started looking into syslog, which reported a failure:
Feb 18 15:00:28 ramon cmk-agent-ctl[1214837]: WARN [cmk_agent_ctl::modes::pull] [::ffff:192.168.1.20]:39106: Request failed. (Error collecting monitoring data.)
While looking this up and reading about it, I realized that check-mk-agent.socket was down.
Starting it back up fixed the problem.
systemctl start check-mk-agent.socket
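A quick way to spot a stopped unit like this is to ask systemd for the state of all agent-related units at once. This is a sketch that assumes the unit names of a 2.1 agent package (the two named in this thread plus cmk-agent-ctl-daemon.service); the function name report_units is made up.

```shell
#!/bin/sh
# Print the systemd state of each agent-related unit, one per line.
report_units() {
    for unit in check-mk-agent.socket \
                check-mk-agent-async.service \
                cmk-agent-ctl-daemon.service; do
        # is-active exits non-zero for stopped units, so ignore the status.
        state=$(systemctl is-active "$unit" 2>/dev/null) || true
        echo "$unit: ${state:-unknown}"
    done
}

# Only query systemd if systemctl is available.
if command -v systemctl >/dev/null 2>&1; then
    report_units
fi
```

In my case, a check like this would have shown check-mk-agent.socket as inactive while the other units were still running.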
So all of this was due to operator error (i.e. myself). Now I know more about the software, hopefully it will be a smooth ride from here, and possibly this post will help someone in the future.