Distributed monitoring with CMCdump - license issue and site instance

Hi,

We have been using the MSP edition of Checkmk for our distributed monitoring setup for several years now.

For a new satellite site, I cannot use Livestatus and have to set up CMCdump instead. It works great, but the login screen of the satellite shows a license warning: “.. days left in your free trial”.

From what I gather, this means some features will be turned off at the end of the trial period. How can I avoid this? Is it really necessary to register that satellite site as well? I thought the hosts and services in use are counted in the main site's license report, and I do not want them to be counted twice.

Also, all hosts of the satellite site show up in the main site as if they were local. The satellite site connected via ‘livestatus’, however, shows up as a separate site inside the Checkmk master. How can I make the hosts of the CMCdump satellite show up as a separate group or site as well?

Best, Peter

It looks like I can report the license usage from the satellites, and it is then counted automatically.

The remaining issue is how to keep the satellite data separate on the master Checkmk server, like the data of the other, normal satellites.

That is not so easy, because the shadow hosts are “inside” your master site at the moment. I would recommend putting the shadow hosts into their own site on your central server.

Yes, this is what I want to do. But on the master site, I cannot edit the host. When I try to edit it, I get the message: “You called this page with an invalid host name.”

Perhaps I need to create another instance on the central master server as a satellite and then add the hosts to that satellite instead? Is that what you are saying?

Yes - you need to import the shadow host config file (the output of cmcdump -C) into the new instance and not into your master instance. Then you can filter by site to select what you want to see.

We used a similar approach, where each CMCdump file was copied into the “matching” site on the central CMK server. We then created a new “central” site on the central CMK server and connected each of the other sites on that server to it as a distributed site. This gives you the isolation needed and allows reporting by customer etc.

Something to keep in mind - as the number of sites/hosts increases, the central site can become quite busy, as it's frequently loading all the host/service status and metrics.

I have now set this up as proposed by Andreas:

The Checkmk main server has an additional instance for the ‘satellite’:

master
subsite

For other users, here is a quick documentation of the file transfer setup:

On the remote satellite:

/omd/sites/sat/etc/cron.d/our_cronjobs:

# export site configuration and upload to master
*/5 * * * * $OMD_ROOT/local/our-push-config.sh >/dev/null 2>&1
# update status
* * * * * $OMD_ROOT/local/our-push-status.sh >/dev/null 2>&1

our-push-config.sh

#!/bin/bash
cmcdump -C > /opt/omd/sites/$OMD_SITE/tmp/$OMD_SITE.mk &&
curl -s -m 5 -T /opt/omd/sites/$OMD_SITE/tmp/$OMD_SITE.mk https://master.checkmk.server/upload/$OMD_SITE.mk

our-push-status.sh

#!/bin/bash
cmcdump > /opt/omd/sites/${OMD_SITE}/tmp/state &&
curl -s -m 5 -T /opt/omd/sites/${OMD_SITE}/tmp/state https://master.checkmk.server/upload/${OMD_SITE}.state
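
One possible hardening of the two push scripts above, as a sketch rather than the setup actually used here: write the dump to a temporary file first and only move it into place if the command succeeded and produced output, so that a failed cmcdump never leads to an empty file being uploaded. The function name dump_to_file is hypothetical. The curl call in the trailing comment reuses the path and URL from the scripts above and adds --fail so that HTTP errors show up as a non-zero exit code.

```shell
#!/bin/bash
# Sketch only: dump_to_file is a hypothetical helper, not part of Checkmk.

# Run a command, capture its stdout in a temp file, and move the file into
# place only if the command succeeded and wrote at least one byte.
dump_to_file() {
  local target=$1
  shift
  local tmp
  tmp=$(mktemp "${target}.XXXXXX") || return 1
  if "$@" > "$tmp" && [ -s "$tmp" ]; then
    mv "$tmp" "$target"
  else
    rm -f "$tmp"
    return 1
  fi
}

# Intended use on the satellite (not executed here):
#   dump_to_file "/opt/omd/sites/${OMD_SITE}/tmp/state" cmcdump &&
#   curl -sS --fail -m 5 -T "/opt/omd/sites/${OMD_SITE}/tmp/state" \
#     "https://master.checkmk.server/upload/${OMD_SITE}.state"
```

Because the target file is only replaced after a successful dump, an older but valid dump stays in place if cmcdump fails.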

In the extra ‘satellite’ instance on the central server:

etc/cron.d/our-import-cronjobs

# import site config and state ASAP
* * * * * local/bin/wait-for-files.sh >/dev/null 2>&1

and the import script itself:

local/bin/wait-for-files.sh

#!/bin/bash
# import status from satellite as soon as possible
# kill this job with 'pkill inotifywait'
LOCK=/tmp/upload.lock

if lockfile -l 1 "$LOCK"; then
  echo $$ > "$LOCK"
  (
    cd "$HOME" || exit 1
    # inotifywait -m prints "<dir> <events> <file>"; join dir and file name
    inotifywait -m -e close_write /omd/upload/ |
    gawk '{print $1$3; fflush()}' |
    while read -r f; do
      file=$(basename "$f")
      echo "file=$file"
      case $file in
        sat.state)
          # feed the status dump into the local Livestatus socket
          unixcat tmp/run/live < "/omd/upload/$file"
          ;;
        sat.mk)
          # install the shadow host config and reload the core
          cp "/omd/upload/$file" etc/check_mk/conf.d
          cmk -O
          ;;
      esac
    done
    rm -f "$LOCK"
  )&
  disown
else
  pid=$(cat "$LOCK")
  kill "$pid"
  rm -f "$LOCK"
fi
exit 0
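
The dispatch step of the import script can also be sketched as a small function, which makes it easy to try out without a running site. This is only an illustration: classify_upload and watch_uploads are hypothetical names, and matching by suffix instead of the fixed names sat.state and sat.mk is an assumption that only holds if a single satellite uploads into the folder. The watcher uses the --format option of inotifywait to print the full path directly instead of reassembling it with gawk.

```shell
#!/bin/bash
# Sketch only: hypothetical helper names, not part of Checkmk.

# Map an uploaded file name to the action the import script should take.
classify_upload() {
  case $(basename "$1") in
    *.state) echo state ;;   # status dump: pipe into the Livestatus socket
    *.mk)    echo config ;;  # shadow host config: copy to conf.d, then cmk -O
    *)       echo ignore ;;
  esac
}

# Watch the upload folder and act on each completely written file.
# Must run as the site user from the site's home directory.
watch_uploads() {
  local dir=${1:-/omd/upload}
  inotifywait -m -q -e close_write --format '%w%f' "$dir" |
  while read -r path; do
    case $(classify_upload "$path") in
      state)  unixcat tmp/run/live < "$path" ;;
      config) cp "$path" etc/check_mk/conf.d && cmk -O ;;
    esac
  done
}
```

Calling watch_uploads from the cron-started wrapper would replace the inotifywait/gawk/while pipeline; the locking around it stays as in the script above.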

The relevant part of the nginx config (ip restricted; not shown here):

	# upload folder for satellites
	location ~ "/upload/(.*)(\.state|\.mk)$" {
           alias     /opt/omd/upload/$1$2;
           client_body_temp_path  /tmp/upload_tmp;
           dav_methods  PUT DELETE MKCOL COPY MOVE;
           create_full_put_path   on;
           dav_access             group:rw  all:r;
	}
	# proxy to the local 'sat' shadow site
	location /sat {
		proxy_pass http://localhost:5001;
	}

The advantage of inotifywait is that the delay is minimal compared to a cron job that runs once a minute. The cron job is only there to make sure that the inotifywait script is running.
