I have been using Check_MK for almost a year now, so I have some experience with it as a monitoring system.
I have an installation of Check_MK in virtual machines using distributed monitoring (1 master, 2 slaves): the master and one slave in DMZ 1, and one slave in DMZ 2.
I am exporting data to InfluxDB and visualizing it in Grafana.
I am also using the feature of creating tickets in ServiceNow and Jira when a critical service event occurs.
I have some questions on how to make Check_MK highly available.
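To illustrate the ticketing part: a Checkmk notification script receives the event details as `NOTIFY_*` environment variables, and ServiceNow exposes its Table API at `/api/now/table/incident`. The sketch below combines the two; the instance URL and credentials are placeholders, and the exact field mapping is an assumption, not the configuration actually used here.

```python
#!/usr/bin/env python3
"""Sketch of a Checkmk notification script opening a ServiceNow incident."""
import base64
import json
import os
import urllib.request

# Placeholder instance URL; replace with your ServiceNow instance.
SNOW_URL = "https://example.service-now.com/api/now/table/incident"

def build_incident(env):
    """Map Checkmk NOTIFY_* environment variables to an incident payload."""
    return {
        "short_description": "{} / {} is {}".format(
            env.get("NOTIFY_HOSTNAME", "unknown"),
            env.get("NOTIFY_SERVICEDESC", "host check"),
            env.get("NOTIFY_SERVICESTATE", env.get("NOTIFY_HOSTSTATE", "UNKNOWN")),
        ),
        "description": env.get("NOTIFY_SERVICEOUTPUT", ""),
        "urgency": "1",  # treat every notified event as critical (assumption)
    }

def send_incident(payload, user, password):
    """POST the incident to the ServiceNow Table API with basic auth."""
    req = urllib.request.Request(
        SNOW_URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req.add_header("Authorization", f"Basic {token}")
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.status

if __name__ == "__main__":
    print(build_incident(dict(os.environ)))
```

In a real setup this would live in the site's `local/share/check_mk/notifications/` directory and be selected as a notification method, but treat the above strictly as a sketch.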
Imagine that on Friday at 22:00 Check_MK goes down (one of the slaves in a DMZ fails because of infrastructure problems). During the whole weekend I would lose all monitoring and metrics, and I could not raise a ticket to notify the on-call staff about the infrastructure problem: a very big problem.
So I am looking for a way to fail over when one of the slaves or the master is down; a backup instance should take over the monitoring automatically.
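To make the failover idea concrete, a very rough watchdog could probe the primary site and start a cold standby when several consecutive probes fail. `omd start <site>` is the real Checkmk site command, but the host name, port, site name, and the whole cold-standby approach below are assumptions; a properly supported HA setup is preferable.

```python
"""Rough sketch of a failover watchdog for a cold-standby Checkmk site."""
import socket
import subprocess
import time

PRIMARY = ("monitoring-primary.example", 80)  # placeholder host and port
STANDBY_SITE = "mysite"                       # placeholder omd site name
FAILURES_BEFORE_FAILOVER = 3                  # avoid flapping on one bad probe

def is_reachable(host, port, timeout=5):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def should_fail_over(history, threshold=FAILURES_BEFORE_FAILOVER):
    """Fail over only after `threshold` consecutive failed probes."""
    return len(history) >= threshold and not any(history[-threshold:])

def main():
    history = []
    while True:
        history.append(is_reachable(*PRIMARY))
        if should_fail_over(history):
            # Start the standby site (needs root or the site user's privileges).
            subprocess.run(["omd", "start", STANDBY_SITE], check=True)
            break
        time.sleep(30)

if __name__ == "__main__":
    main()
```

Note that this only starts a standby; it does nothing about syncing configuration or historic metrics, which is exactly why an appliance-level cluster is the cleaner answer.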
Hi Ano,
As far as I understand, you are not using the Checkmk appliance. If you were, you could set up an HA system right away, as the appliance has all the required features.
If you use the non-commercial version, you need to set things up on your own. Personally, if you require HA and want it supported by Tribe, you should investigate the option of upgrading to Tribe appliances, either virtual or HW.
Hello Kribbit,
At first we used the Raw Edition, then we upgraded to the paid version.
I am using the paid version of Check_MK (CEE 1.6.p9).
I’ve installed all the Check_MK instances on Red Hat Enterprise Linux 7 in a hybrid cloud infrastructure (Red Hat CloudForms / VMware, on premises).
You should be able to set up clustering at the appliance level in that case.
You need to make sure that you run the cluster connection over a different NIC than your live interfaces. According to the documentation, bonded interfaces on a VM will not do you any good. The following picture shows a HW-based setup with bonded interfaces; in your case, you would not use bonded interfaces. You will still need to set up a cluster IP. In the distributed setup, your master will connect to the slaves via the cluster IP. Please keep in mind that the actual monitoring will still be issued from the “original” IPs, not from the cluster IP.
If you set it up like this, the clustered appliance will take over in case of trouble; you do not have to sync anything around manually.
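One way to verify that the master can reach a slave through the cluster IP is a raw Livestatus query: Checkmk's distributed monitoring talks Livestatus, which is commonly exposed on TCP port 6557. The cluster IP `192.0.2.10` below is a placeholder, and whether your sites expose Livestatus on TCP at all depends on your configuration.

```python
"""Minimal raw Livestatus-over-TCP probe (sketch, not a supported tool)."""
import socket

def build_livestatus_query(table, columns):
    """Build a minimal Livestatus GET query; a blank line terminates it."""
    return f"GET {table}\nColumns: {' '.join(columns)}\n\n"

def query_livestatus(host, port=6557, table="status", columns=("program_version",)):
    """Send the query over TCP and return the raw text response."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(build_livestatus_query(table, columns).encode())
        sock.shutdown(socket.SHUT_WR)  # signal end of request
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode()

if __name__ == "__main__":
    # 192.0.2.10 is a placeholder cluster IP; replace with yours.
    print(query_livestatus("192.0.2.10"))
```

If the query answers via the cluster IP before and after a manual failover, the distributed connection should survive a node failure.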