I configured CMK on this server some time ago. It was running fine until about 6 months ago, but we only found out recently. When we checked the graphs for the server, we found that they have not been populated since June 2023.
Installation Steps
I followed this post for the steps. Based on the instructions there, I summarized them into the steps below :
Copy the attached file check_mk_agent to /usr/bin/, then add execute permission for the file.
chmod +x /usr/bin/check_mk_agent
Check if these directories exist. If not, then create them.
a. Checkmk Library : /usr/check_mk/lib
b. Checkmk Config : /usr/check_mk/conf
c. Checkmk Var : /tmp/check_mk
d. Checkmk Logs : /var/log/check_mk
Add the following entry to /etc/services :
check_mk 6556/tcp # Checkmk Monitoring Agent
Add the following entry to /etc/inetd.conf :
check_mk stream tcp nowait root /usr/bin/check_mk_agent -d
Copy /etc/profile to / and rename it .profile :
cp /etc/profile /.profile
Restart inetd using this command : refresh -s inetd
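In script form, the two file edits above could look roughly like the sketch below. ensure_entry is a made-up helper name, not something shipped with the agent. It only appends the line when no entry for the service exists yet, so running it a second time does not create duplicates.

```shell
# Append LINE to FILE unless a line starting with NAME already exists.
# Hypothetical helper for illustration: ensure_entry FILE NAME LINE
ensure_entry() {
    file=$1
    name=$2
    line=$3
    # An existing entry for this service means there is nothing to do
    if grep -q "^${name}[[:space:]]" "$file" 2>/dev/null; then
        return 0
    fi
    printf '%s\n' "$line" >> "$file"
}
```

Run as root, the calls would be ensure_entry /etc/services check_mk "check_mk 6556/tcp # Checkmk Monitoring Agent" and ensure_entry /etc/inetd.conf check_mk "check_mk stream tcp nowait root /usr/bin/check_mk_agent -d", followed by refresh -s inetd.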
Questions
I am not very familiar with AIX systems, so please excuse me for asking, but how do I start and stop check_mk_agent in an installation like this ?
I read something about using the mkssys command and wonder whether it is possible to use it with check_mk_agent. Does anyone know ?
Thank you in advance for any advice or guidance you can provide. Do ask if there’s anything else you need in order to troubleshoot this problem.
OK. Here’s some troubleshooting information. I hope it is useful for pointing me in the right direction. Again, I’m not familiar with AIX, but I have to do the troubleshooting because I’m the guy for Checkmk-related matters.
My first test was to telnet to port 6556 on the remote server from my Checkmk server. Since it took a long time, I cancelled it and ran nmap instead. Nmap told me that port 6556 on the remote host is filtered, which suggests a possible firewall issue.
So I went to the NetSec team. To cut a long story short, they came back to say that the issue is on the remote server because, when they tested, the remote host was resetting the connection.
On a known-good AIX server where Checkmk is working as expected, when I run netstat -an | grep 6556, the output shows that something is listening on port 6556. On the problematic remote server, however, the command outputs nothing.
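A plain grep for 6556 also matches established connections, so a slightly stricter check might help when comparing the two servers. The sketch below uses a made-up function name, port_is_listening. It reads netstat -an output on stdin and assumes the usual layout where the local address is the fourth column and the state is the last one.

```shell
# Succeed if the netstat -an output on stdin shows a TCP socket in
# LISTEN state on the given port. Hypothetical helper for illustration.
port_is_listening() {
    port=$1
    awk -v p="$port" '
        $1 ~ /^tcp/ && $NF == "LISTEN" {
            # The local address ends in .PORT on AIX and :PORT on Linux
            n = split($4, a, /[.:]/)
            if (a[n] == p) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}
```

On the server it would be used as netstat -an | port_is_listening 6556 && echo "listening" || echo "nothing on 6556".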
I can only say that inetd needs to be configured on the target server to accept connections on port 6556, and that inetd needs to know that the agent script should be executed.
But that should already be the case once the steps you mentioned are done.
You could check on the AIX host whether port 6556 is listening with something like netstat or lsof - I don’t know what is available there.
This is 100% an AIX problem and not directly a Checkmk one.
I can’t help you troubleshoot this issue, which seems to be AIX + inetd related. However, we never use this method because
a) it’s more complicated to set up
b) it’s insecure - cleartext agent communication
The alternative - and I think @mschlenker wanted to edit the docs to reflect this as the suggested way for Checkmk on AIX - is using the “legacy” ssh mode, which feels less legacy to me than inetd. The article is “Monitoring Linux in legacy mode”; it says “Linux” but works exactly the same for AIX.
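The core of the ssh mode is an authorized_keys entry on the monitored host that forces the key to run the agent and nothing else. A minimal sketch of building such a line is below. agent_key_line is a made-up helper name, and the extra no-* restrictions are standard OpenSSH authorized_keys options that I am adding as an assumption, not something taken from the Checkmk docs.

```shell
# Print an authorized_keys entry that forces every login with this key
# to run the agent. Hypothetical helper: agent_key_line PUBKEY [AGENT_PATH]
agent_key_line() {
    pubkey=$1
    # Agent path as used earlier in this thread; override if yours differs
    agent=${2:-/usr/bin/check_mk_agent}
    printf 'command="%s",no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty %s\n' \
        "$agent" "$pubkey"
}
```

The output goes into the authorized_keys file of the user the Checkmk server logs in as. On the Checkmk side the host then needs a rule that calls ssh instead of connecting to port 6556; the legacy mode article describes that part.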
We have a draft article, “Monitoring Unix”, but work on it has stalled for several reasons. One is the behavior of the various inetd implementations. There are substantial differences between GNU inetd, xinetd, OpenBSD inetd, 4.4BSD inetd, BusyBox inetd and the others.
Currently, our long term focus is to make the Agent Controller work on other Unices than Linux on x86/64, which is mostly startup configuration and packaging. In the short term, the “Monitoring Unix” article might be released, but with rather generic inetd related instructions, leaning more towards @gstolz’ suggestion of invoking the agent via the OpenSSH server.
The Unix person has scheduled a restart of inetd as part of the troubleshooting steps. Hopefully this resolves the issue.
I have read about the SSH method and am planning to transition to it when I migrate the current Checkmk to another server. I have been thinking of doing an upgrade for a while now.
I just want to provide an update on the resolution of this issue.
So basically the issue was resolved after the UNIX person stopped and started inetd. Checkmk is now able to get the data from the agent on the remote AIX host. I had to go in and force a check for some of the ports that we are also monitoring.
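For anyone who lands here later: I did not run the restart myself, so the following is only a sketch of what it probably looked like, assuming inetd runs under the AIX System Resource Controller. restart_inetd and the RUN variable are made-up names; setting RUN=echo prints the commands instead of running them. As far as I understand, stopsrc may return before the subsystem has fully stopped, so a short pause between stop and start might be needed.

```shell
# Stop and start inetd through the System Resource Controller, then
# show its status. Hypothetical wrapper for illustration.
# Set RUN=echo for a dry run that only prints the commands.
restart_inetd() {
    run=${RUN:-}
    $run stopsrc -s inetd || return 1
    $run startsrc -s inetd || return 1
    $run lssrc -s inetd
}
```

This also more or less answers my original question. With this setup check_mk_agent is not a service of its own; inetd starts it for each incoming connection on port 6556, so there is nothing to start or stop apart from inetd itself.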
Thanks again for all the recommendations. Take care and have a good day.