[Check_mk (english)] Checking if backups are working

There are a few different ways.

One way is to copy off the /opt/omd/sites/SITENAME/var/check_mk/wato/snapshots/ dir, and if you need to restore, simply restore the latest snapshot. The downside is that this auto-snapshot only backs up the basic configuration and event console data. If you want the performance data, etc., then you’ll need to create the snapshot by hand. I would love for someone to tell me how to control what’s in the snapshot if I want more than the basic config backed up.

Another way is via the check_mk command (i.e. cmk --backup FILE.tar.gz and --restore). I have a script that runs nightly to back it up to a network share.

Another way is to use the omd command (omd backup) but I haven’t tested that.
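A nightly job along the lines described above could be sketched as follows. This is only a minimal sketch, assuming your Check_MK version still provides `cmk --backup` and that the script runs as the site user; the function names `backup_name` and `run_backup`, the `BACKUP_CMD` override, and the share path are all hypothetical and not taken from the original script.

```shell
#!/bin/sh
# Hypothetical nightly backup helper (a sketch, not the poster's actual script).

backup_name() {
    # $1 = site name, $2 = date stamp (YYYY-MM-DD)
    printf '%s-%s.tar.gz\n' "$1" "$2"
}

run_backup() {
    # $1 = site name, $2 = destination directory (e.g. a mounted network share)
    site=$1
    dest=$2
    file="$dest/$(backup_name "$site" "$(date +%Y-%m-%d)")"
    # BACKUP_CMD is an assumption added for testing; it defaults to the
    # "cmk --backup" command mentioned in the post.
    ${BACKUP_CMD:-cmk --backup} "$file" || return 1
    # Do not report success if the archive is missing or empty.
    [ -s "$file" ] || return 1
    printf '%s\n' "$file"
}
```

From cron, one might call `run_backup SITENAME /mnt/backup-share` (path assumed) and alert on a non-zero exit status.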

"Date: Mon, 25 Jul 2016 14:40:43 +0000

Hello,

I am trying to find a way to monitor our backups on network drives using Check_MK.

We are using Tivoli for backups.

I am assuming I will need to create some kind of script but I am not sure where to start.

Any help or advice will be appreciated.

Best Regards,

Laura"

···


We used the piggyback method:

  1. On our central Tivoli server we produce a daily report of backups, one line per server.

  2. From this report we generate a local check (which goes into /usr/lib/check_mk_agent/local) that creates piggyback data. The check can reside on the Tivoli server or, as in our case, be copied to the monitoring master server.

    e.g.

    #!/bin/sh

    echo "<<<<server-a.com>>>>"
    echo "<<<local>>>"
    echo "0 tsm_backup - Backup Completed"
    echo "<<<<>>>>"

    echo "<<<<server-a.com>>>>"
    echo "<<<local>>>"
    echo "2 tsm_backup - Backup Terminated"
    echo "<<<<>>>>"

Please note the use of both three-bracket groups (<<<local>>>) and four-bracket groups (<<<<hostname>>>>).

Once this piggyback data is sent, a service ‘tsm_backup’ will appear on each host specified.
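Generating that output from the daily report could look roughly like the sketch below. The report format is an assumption (one line per server, `hostname status`, with `Completed` meaning success), as is the function name `report_to_piggyback`; the real Tivoli report will need its own parsing.

```shell
#!/bin/sh
# Sketch: turn a one-line-per-server report into piggybacked local-check output.
# Assumed (hypothetical) input format: "<hostname> <status>",
# e.g. "server-a.com Completed".

report_to_piggyback() {
    while read -r host status; do
        # Skip blank lines and comment lines.
        case $host in ''|'#'*) continue ;; esac
        # Anything other than "Completed" is treated as critical (state 2).
        if [ "$status" = "Completed" ]; then state=0; else state=2; fi
        echo "<<<<$host>>>>"      # start piggyback data for this host
        echo "<<<local>>>"        # local-check section
        echo "$state tsm_backup - Backup $status"
        echo "<<<<>>>>"           # end piggyback data
    done
}
```

A script in the agent's local directory would then call something like `report_to_piggyback < /path/to/report` (path assumed).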

Regards

Allan Thorne

eSolutions

Monash University

checkmk-en mailing list

checkmk-en@lists.mathias-kettner.de

http://lists.mathias-kettner.de/mailman/listinfo/checkmk-en

Allan Thorne

Infrastructure Operations and Support

Monash University

738 Blackburn Road, Clayton

Monash University, VIC 3800

Telephone: +61 3 9905 4791

Mobile: +61 (0)408 991 028

Email: allan.thorne@monash.edu

CRICOS Provider 00008C/ 01857J


We use Bacula for backups, which does things differently, but seems pretty similar too.

I had a similar problem - how to monitor something per-client that is run on a central server? The approach I took was:

  • Get Bacula to run a script on each client after the backup is successful. It just drops a one-line text file that has the date and the backup job number in it (it happens to be nicely human readable too).

  • Write a local check to read that file and parse the date out of it. If the date is more than 30 hours old, go to Warning; if it is more than 50 hours old, go to Critical. That means if Bacula fails to visit at least daily, then we’ll get an alert from the system that’s been affected. If everything is okay, the check just echoes out the contents of the file.

  • Have Bacula notify me of all failures (and actually, it’s producing a summary like ‘the backups on these 49 servers all succeeded:’ too) - we use Slack, so it’s posting those notifications to a Slack channel, but email would be fine too.

The last step is the alert that tells you to go do some work (and that we actually ran some backups last night). The first two steps are really just a secondary check to make sure Bacula is running properly, that we didn’t misconfigure a firewall which prevents backups to a host, and that all production hosts are in Bacula’s config.
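The age check in the second step might be sketched like this. It assumes, purely for illustration, that the stamp file's first field is a Unix timestamp followed by free text such as the job number (the real file is described as human readable, so its date would need parsing); the function name `backup_age_check` and the service name `backup_age` are hypothetical.

```shell
#!/bin/sh
# Sketch of a local check that alerts when the last backup stamp is too old.
# Assumed (hypothetical) stamp file format: "<epoch-seconds> <free text>".

backup_age_check() {
    # $1 = stamp file, $2 = current time in epoch seconds (defaults to now)
    file=$1
    now=${2:-$(date +%s)}
    if [ ! -r "$file" ]; then
        echo "2 backup_age - No readable backup stamp file at $file"
        return
    fi
    read -r stamp rest < "$file"
    age_s=$(( $now - $stamp ))
    age_h=$(( $age_s / 3600 ))
    # Thresholds from the post: warn after 30 hours, critical after 50 hours.
    if [ "$age_s" -gt 180000 ]; then state=2
    elif [ "$age_s" -gt 108000 ]; then state=1
    else state=0
    fi
    echo "$state backup_age age_hours=$age_h;30;50 Last backup ${age_h}h ago: $rest"
}
```

Placed in the agent's local directory, the script would end with a call such as `backup_age_check /var/run/last_backup` (path assumed).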

In your case, if you have a summary file on your Tivoli server with one line per client, then you can write a local check (on your Tivoli server) that parses that file and looks for failures (and goes critical if there are any). You could get all fancy and have it count successes and failures and put them in as ‘performance stats’ so you get graphs in CheckMK too, if you wanted.
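A summary check of that kind could be sketched as below. The input format (`hostname status` per line, `Completed` meaning success), the function name `backup_summary_check` and the service name `tsm_backup_summary` are assumptions made for the example.

```shell
#!/bin/sh
# Sketch: one local check on the backup server that counts successes and
# failures from a one-line-per-client summary read on stdin.
# Assumed (hypothetical) input format: "<hostname> <status>".

backup_summary_check() {
    ok=0
    failed=0
    while read -r host status; do
        # Skip blank lines and comment lines.
        case $host in ''|'#'*) continue ;; esac
        if [ "$status" = "Completed" ]; then
            ok=$(( $ok + 1 ))
        else
            failed=$(( $failed + 1 ))
        fi
    done
    # Any failure makes the service critical.
    if [ "$failed" -gt 0 ]; then state=2; else state=0; fi
    # Both counters are emitted as metrics (separated by "|") so they can be graphed.
    echo "$state tsm_backup_summary succeeded=$ok|failed=$failed $ok succeeded, $failed failed"
}
```

Because the state is recomputed on every run, the service returns to OK by itself once a failed backup has been re-run successfully.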

In fact, having just thought through your situation, I might well do something similar for myself - that way we’d get a CheckMK alert (as well as a Slack notification) when a backup fails in production. The good thing is that when a failed backup is later re-run successfully, it would clear the alert (which Slack/email doesn’t show very well). Could be cool :)

…Ralph


Ralph Bolton

Systems Administrator

**Calltracks Ltd**

Email: ralph.bolton@calltracks.com

Web: [www.calltracks.com](http://www.calltracks.com/)

Tel: +44 20 3199 9000

Fax: +44 20 3199 9009

High availability call monitoring, tracking and NTS services. The opinions expressed are those of the individual and not the company. Internet communications are not secure and therefore Calltracks Ltd (“the company”) does not accept liability for any claims arising as a result of the use of this medium for transmissions by or to the company. This email and any files transmitted with it are confidential. If you are not the intended recipient, you are hereby notified that any disclosure, distribution or copying of this communication is strictly prohibited. Whilst we take every reasonable precaution to screen out computer viruses from emails, attachments to the email may contain such viruses. We cannot accept liability for loss or damage resulting from such viruses. Calltracks Ltd is registered in England and Wales 6539973 at Unit 15, 3rd Floor, 23-28 Penn Street, London, N1 5DL

For Bacula, I found a plugin on GitHub that works fine, but you have to use PostgreSQL as the database.

https://github.com/sts/checkmk/tree/master/bacula


Best regards,

Walter Tosolini

System Engineer

COGITO srl

Via Tavagnacco 63, 33100 Udine

Tel. +39 0432 486316

Fax +39 0432 1632281

Company information: www.cogitoweb.it - e-mail: info@cogitoweb.it

The content of this email is protected by privacy regulations:

http://www.cogitoweb.it/privacy


Thank you all for the suggestions!
