[Check_mk (english)] Regarding: check_mk_agent for Mac OS X

Hi list,

I started some work on the agent for OS X, which was requested a
few weeks ago.

Based on the FreeBSD agent, I figured out the basic stuff, which so far is:

cpu.loads
cpu.threads
df
netctr.combined
uptime

Also, I wrote a LaunchDaemon, based on the NRPE version.

I don't think it is ready for release yet, but it is ready for testing and
further work.

As this is my first contribution to check_mk, and things are handled
differently on different projects: how would you like it? Raw stuff into
git? Post it on the list? Link to it on my own server?

thanks,

dodger

I have a Force10 S4810 switch that I'm successfully checking via the 'if' service check. However, I'm nervous about using the 32-bit version of this on such a beefy switch. I wanted to see what the outputs would look like if I used 'if64'. Then I realized, I don't know how to explicitly make it use 'if64'. I still can't figure it out!

I'm running a slightly old version of OMD pulled from SVN:

omd-0.47.20110509-rh56-23

Hi Michael,

if you have something possibly interesting for others, then please
simply post it here. If it's really good, we *might* even consider
asking you if we can put it into the main distribution :wink:

Mathias

(In reply to Michael Hirdes's message of 31.08.2011, 15:24.)

_______________________________________________
checkmk-en mailing list
checkmk-en@lists.mathias-kettner.de
http://lists.mathias-kettner.de/mailman/listinfo/checkmk-en

Hello list,

I talked to Michael briefly, as I had also been working on my own updates to check_mk_agent.macosx. I'm not sure my <<<df>>> check is ready for prime-time, as it doesn't handle mount points with spaces in the name. Mac OS X loves disk image files for application bundles (.dmgs). For example, mounting googlechrome.dmg looks like this:

/dev/disk2s2 116Mi 116Mi 0Bi 100% /Volumes/Google Chrome

I suppose this isn't important, as I don't think the Linux agent handles mount points with spaces in the names, either. Unfortunately this practice is very common on Mac OS X.
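
One possible way to cope with spaces in mount points, as a rough sketch rather than a tested agent change: join everything from the sixth column onward into a single token. The function name `df_join_mount_point` and the underscore as joining character are assumptions of this sketch, and it is also an assumption that the df check on the server side would accept the renamed mount point.

```shell
# Hypothetical helper (not part of any shipped agent): reads `df -kP` output
# on stdin, skips the header line, inserts the "hfs" type column, and joins a
# mount point that contains spaces into one token using underscores.
df_join_mount_point() {
    awk 'NR > 1 {
        mp = $6
        for (i = 7; i <= NF; i++) mp = mp "_" $i
        print $1, "hfs", $2, $3, $4, $5, mp
    }'
}

# Intended use on OS X (assumption: same df flags as the df line in this post):
#   echo '<<<df>>>'
#   df -kPt hfs | df_join_mount_point
```

Note that consecutive spaces in a mount point collapse into one underscore here, so the original name cannot always be reconstructed from the output.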

Anyway, here's what I have:

echo '<<<df>>>';
df -kPt hfs | sed -e 's/^\([^ ][^ ]*\) \(.*\)$/\1 hfs \2/' | sed 1d

echo '<<<cpu>>>';
echo `sysctl -n vm.loadavg | tr -d '{}'` `top -l 1 -n 1 | egrep ^Processes: | awk '{print $4"/"$2;}'` `echo 'echo $$' | bash` `sysctl -n hw.ncpu`

echo '<<<uptime>>>';
echo `date +%s` - `sysctl -n kern.boottime | cut -d' ' -f 4,7 | tr ',' '.' | tr -d ' '` | bc

echo '<<<netctr>>>';
date +'%s'; netstat -inb | egrep -v '(^Name|lo|plip)' | grep Link | awk '{ print $1,$7,$5,$6,"0","0","0","0","0",$10,$8,$9,"0","0",$11,"0","0"; }'
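
The uptime line above is dense, so here is the same calculation spelled out step by step, as a sketch only. The function names `boottime_sec` and `uptime_secs` are made up for this example, and the sample `kern.boottime` string in the comment follows the layout that the `cut` fields in the one-liner imply. Unlike the original, this version ignores the microseconds and does not need `bc`.

```shell
# Hypothetical helpers, not taken from any shipped agent.

# boottime_sec: read `sysctl -n kern.boottime` output on stdin, e.g.
#   { sec = 1314800000, usec = 0 } Wed Aug 31 16:13:20 2011
# and print only the value after "sec = ".
boottime_sec() {
    sed -e 's/^.*sec = \([0-9][0-9]*\),.*$/\1/'
}

# uptime_secs NOW BOOT: print the difference between two epoch timestamps.
uptime_secs() {
    echo $(( $1 - $2 ))
}

# Intended use on OS X:
#   echo '<<<uptime>>>'
#   uptime_secs "$(date +%s)" "$(sysctl -n kern.boottime | boottime_sec)"
```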

(In reply to Mathias Kettner's message of Thursday, September 08, 2011, 2:56 AM.)

> /dev/disk2s2 116Mi 116Mi 0Bi 100% /Volumes/Google Chrome

One problem I ran into: mounted images are always full, and it is rather
unpredictable which volume OS X uses. So my solution at the moment is to unmount
everything during inventory.

> I suppose this isn't important, as I don't think the Linux agent handles mount points with spaces in the names, either. Unfortunately this practice is very common on Mac OS X.

Ack.

> Anyway, here's what I have:

As we discussed earlier, basically the same here.

The IPMI command needs a little bit of tweaking, too:

ipmitool sensor list \
        | grep -v 'command failed' \
        | sed 1d \
        | sed -e 's/ *| */|/g' -e "s/ /_/g" -e 's/_*$//' -e 's/|/ /g' \
        | egrep -v '^[^ ]+ na ' \
        | grep -v ' discrete '
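
To see what the sed and grep stages do, they can be run on canned input without an IPMI device. This is only an illustration: the function name `normalize_ipmi` and the three sample sensor lines are invented for the example, and `grep -E` stands in for `egrep`.

```shell
# Hypothetical wrapper around the text-processing stages of the command
# above; reads sensor lines on stdin instead of calling ipmitool.
normalize_ipmi() {
    sed -e 's/ *| */|/g' -e 's/ /_/g' -e 's/_*$//' -e 's/|/ /g' \
        | grep -E -v '^[^ ]+ na ' \
        | grep -v ' discrete '
}

# Invented sample lines in "name | reading | unit | status" layout.
printf '%s\n' \
    'CPU Temp         | 45.000     | degrees C  | ok' \
    'Fan 1            | na         | RPM        | na' \
    'PS Status        | 0x1        | discrete   | 0x0100' \
    | normalize_ipmi
```

With this input only `CPU_Temp 45.000 degrees_C ok` remains: the fan line is dropped because its reading is `na`, and the power supply line is dropped because it is a discrete sensor.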

The LaunchDaemon script to get check_mk_agent started on OS X is working
fine (on 10+ servers at the moment).

Please find it attached.

bye,

micha

de.jvm.check_mk-agent.plist (827 Bytes)

(In reply to the message from Wood, Adam - 1528 of 08.09.11, 17:40.)

--
Michael Hirdes
Jung von Matt/it-services GmbH
Glashüttenstraße 79
20357 Hamburg

Tel: +49 40-4321-1339
Fax: +49 40-4321-1114
E-Mail: michael.hirdes@jvm.de
Internet: http://www.jvm.com

Managing Directors: Ulrich Pallas, Frank Wilhelm
AG HH HRB 98380

Hi all,

I'm running OMD 0.48, and monitoring a number of Force10 switches, e.g., S4810, S2400, etc

When these devices are walked via SNMP, the CPU is spiking very high on them, triggering all sorts of snmptraps.

Is there a way to mitigate this?

I've added these hosts to my bulkwalk_hosts list in main.mk:

bulkwalk_hosts = [
  ( [ 'force10' ], ALL_HOSTS ),
]

(where all the pertinent switches are tagged with 'force10')

Any help would be greatly appreciated!

Cheers,

Jonathan

I'm going to answer my own question here.

This was the default setting in Force10 firmware until recently. Basically, when the snmpwalk hits, the CPU spikes to 100 percent for a second or so. It's not dangerous, because the switch will preempt SNMP activity in favor of traffic if it needs to.

There's a way to alter the trap behavior in newer firmware. To quote the support rep:

"Actually in our newer software by default this threshold alarm is disabled by default and you must enable it with the command:

To turn on.

util-threshold cpu 5sec cp high <value> low <value>

To turn off.

no util-threshold cpu 5sec cp high <value> low <value>

You are currently on 7.7.1.0 and would need to update to the 8.4.1.3. Other than this I don't know of any other option to change this."

(In reply to Jonathan Mills's message of 09/09/2011, 01:51 PM.)

Jonathan, how are you?

I noticed that my Enterasys "N" line switches also hit 100% CPU when
Check_MK queries all ports.

I'm running version 1.1.11i3. It is possible to choose which
interfaces to monitor, and I'm using WATO, but it continues to scan all
ports, even those I do not want to monitor.

Which version do you have?

(In reply to Jonathan Mills <jonmills@gmail.com>, 2011/9/9.)

Daniel:

I can't confirm if your issue was at all related to mine, but you might see if there's a threshold you can set in your switch firmware that will tell it not to send an alarm snmptrap unless the CPU stays at 100% for X number of seconds (and set X to whatever you think is a high enough value to avoid the alert, maybe 5 or 10 seconds).

As for the possibility of choosing which interfaces to monitor with Check_MK, this is easy to do.

First you monitor the whole switch as usual via snmp. Then you'll have to basically negate the ports/interfaces you don't want monitored. This is done with an ignored_services section.

Let's say I want to avoid monitoring "Port 15" on all Fibre Channel switches that I've tagged with 'fiber' in my hosts section:

ignored_services += [
  ( [ 'fiber' ], ALL_HOSTS, [ 'Port 15' ] ),
]

Here I want to not monitor "Interface TenGigabitEthernet 0/14" on a switch with FQDN r3-02-switch1.example.com:

ignored_services += [
  ( [ "r3-02-switch1.example.com" ], [ 'Interface TenGigabitEthernet 0/14' ] ),
]

Here I've ignored everything that I use AutoFS with (since it would alarm every time a mount is automatically unmounted after inactivity):

ignored_services += [
  # These are autofs mounts and we shouldn't monitor them
  ( ALL_HOSTS, [ 'NFS mount /home/*' ] ),
  ( ALL_HOSTS, [ 'NFS mount /work/*' ] ),
  ( ALL_HOSTS, [ 'NFS mount /projects/*' ] ),
  ( ALL_HOSTS, [ 'NFS mount /shared/*' ] ),
]
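
A note on the patterns: as far as I know, Check_MK treats each service entry as a regular expression matched against the beginning of the service description, not as a shell glob. If that holds, the trailing `*` in 'NFS mount /home/*' only means "zero or more slashes", and 'Port 15' would also catch a service called 'Port 150'. The following shell function is a rough stand-in for that matching using `grep -E`. It is not Check_MK code, and the name `service_matches` is made up.

```shell
# Hypothetical illustration of prefix regex matching (assumed behaviour,
# not Check_MK's actual implementation).
# Usage: service_matches PATTERN SERVICE_DESCRIPTION
service_matches() {
    printf '%s\n' "$2" | grep -E -q -- "^($1)"
}

service_matches 'NFS mount /home/*' 'NFS mount /home/alice' && echo "ignored: NFS mount /home/alice"
service_matches 'Port 15' 'Port 150' && echo "ignored: Port 150 (prefix match)"
service_matches 'Port 15$' 'Port 150' || echo "kept: Port 150 (pattern anchored with \$)"
```

If an exact match is wanted, appending `$` to the pattern should do it, under the same assumption.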

(In reply to daniel majela's message of 09/09/2011, 05:15 PM.)
