Hello,
I have multiple Checkmk systems running, all controlled from one of those instances. Attached to each of these Checkmk systems there are a number of external devices, which can only be reached by connecting to a specific port. The response from the devices has to be cleaned and parsed to get the desired metrics.
Because these devices are external, my first attempt was to create active checks: a bash script that calls a Python script, which handles querying the device and parsing the data.
Here is a simple version of the bash script:
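#!/bin/bash
# Arguments: $1 = device IP, $2 = device port, $3 = threshold
# Capture stdout and stderr of the Python helper.
var=$(python3 /tmp/my_test_script.py -i "$1" -p "$2" -l "$3" 2>&1)
if [[ $((var+0)) -lt $3 ]]; then
    echo "All good"
    exit 0
else
    echo "It is bad"
    exit 1
fi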
And here is the Python script:
import argparse
import socket
import sys


def main(ip, port, limit):
    # limit is not used here; the bash wrapper does the comparison.
    server_address = (ip, port)
    message = b"secret"  # sendall() needs bytes in Python 3
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(1)
            sock.connect(server_address)
            sock.sendall(message)
            data = sock.recv(30)
    except OSError:
        # Covers timeouts and refused or failed connections.
        sys.exit(1)
    # Print only on success, as text rather than as a bytes literal.
    print(data.decode(errors="replace").strip())
    sys.exit(0)


def parse_arguments():
    parser = argparse.ArgumentParser()
    req_arg_group = parser.add_argument_group("Required arguments")
    req_arg_group.add_argument("--ip", "-i", required=True, help="Specify host IP")
    req_arg_group.add_argument("--port", "-p", required=True, type=int, help="Specify host port")
    req_arg_group.add_argument("--limit", "-l", required=True, help="Specify threshold")
    return parser.parse_args()


if __name__ == "__main__":
    args = parse_arguments()
    main(args.ip, args.port, args.limit)
Then I add an active check in Checkmk with the parameters for the bash script (IP and port of the device, and the threshold). As a first attempt it works well, though of course the messages and exit codes have to be improved.
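One possible way to improve the messages and exit codes is to drop the bash wrapper and let a single Python script do the query, the comparison, and the reporting. The following is only a minimal sketch under some assumptions: the device reply is taken to be a single number (as the bash comparison implies), the function names `query_device` and `evaluate` and the metric name `value` are made up for illustration, and the states follow the usual Nagios plugin convention (0 = OK, 1 = WARN, 2 = CRIT, 3 = UNKNOWN) with performance data after a `|`.

```python
import argparse
import socket

# Exit codes of the Nagios plugin convention used by active checks.
OK, WARN, CRIT, UNKNOWN = 0, 1, 2, 3


def query_device(ip, port, message=b"secret", timeout=1.0):
    """Send the request and return the raw reply (at most 30 bytes)."""
    with socket.create_connection((ip, port), timeout=timeout) as sock:
        sock.sendall(message)
        return sock.recv(30)


def evaluate(raw, limit):
    """Turn the raw reply into (exit_code, status_line).

    Assumption: the reply is a single number.
    """
    try:
        value = float(raw.decode(errors="replace").strip())
    except ValueError:
        return UNKNOWN, f"UNKNOWN - cannot parse reply {raw!r}"
    # Text after "|" is performance data in the form name=value;warn;crit
    perfdata = f"value={value};;{limit}"
    if value < limit:
        return OK, f"OK - value is {value} | {perfdata}"
    return CRIT, f"CRIT - value is {value} (limit {limit}) | {perfdata}"


def main(argv=None):
    """Parse arguments, query the device, print one status line, return the state."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--ip", "-i", required=True, help="Specify host IP")
    parser.add_argument("--port", "-p", required=True, type=int, help="Specify host port")
    parser.add_argument("--limit", "-l", required=True, type=float, help="Specify threshold")
    args = parser.parse_args(argv)
    try:
        raw = query_device(args.ip, args.port)
    except OSError as exc:
        print(f"UNKNOWN - no answer from {args.ip}:{args.port}: {exc}")
        return UNKNOWN
    code, line = evaluate(raw, args.limit)
    print(line)
    return code
```

To use this as a script, it would end with `sys.exit(main())` under the usual `if __name__ == "__main__":` guard (plus `import sys`), so that the return value becomes the exit code. Keeping the parsing in `evaluate` separate from the socket code means the cleaning and threshold logic can be tried out without a device attached.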
This is a very cumbersome method, so I want to ask if I am missing an obvious way to monitor these systems. No answer or suggestion is too obvious, as I am still very new to this.
Many thanks!