KomtGoed
(TheK)
1
Hi,
I installed Checkmk this weekend. Still finding my way around, but the great documentation helps, thanks!
What I cannot figure out:
How can I create a rule that monitors the SMART parameter powered_on? Would like to receive a notification for drives running for more than x years.
The powered_on hours are displayed for the HDDs in the server in question.
I also managed to find the rule set “SMART ATA (incompatible with legacy plug-in)”. But when adding a rule, the parameters offered there do not include powered_on. What am I missing?
Thanks in advance!
KomtGoed
(TheK)
2
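Solved this with a bash script. It greps the "Powered on:" lines from nagios.log, extracts age and power cycles per drive model, and writes a warning to syslog via logger when a threshold is exceeded.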
#!/bin/bash
# Collect every log line that carries the SMART "Powered on" summary.
GREP_LINES="$(grep "Powered on:" /data/checkmk/cmk/var/nagios/nagios.log)"
#echo $GREP_LINES
# First pass: build the list of unique drive model names (index starts at 1).
modelNames=()
counter=0
while IFS= read -r line; do
    #echo "Found: $line"
    model="$(echo "$line" | sed -E 's/.*SMART ([-A-Za-z0-9 _]+) Stats.*/\1/g')"
    if [[ ! " ${modelNames[*]} " =~ [[:space:]]${model}[[:space:]] ]]; then
        counter=$((counter+1))
        modelNames[$counter]="$model"
    fi
done <<< "$GREP_LINES"
#echo "count: $counter"
WARNING_AGE=0
WARNING_CYCLES=0
# Second pass: parse age and power cycles from each line. Lines further down
# the log overwrite earlier values stored for the same model.
while IFS= read -r line; do
    model="$(echo "$line" | sed -E 's/.*SMART ([-A-Za-z0-9 _]+) Stats.*/\1/g')"
    counter=0
    for modelDummy in "${modelNames[@]}"
    do
        counter=$((counter+1))
        if [ "$model" == "${modelNames[$counter]}" ] ; then
            # echo "$counter: $model"
            # echo $line
            HOURS="$(echo "$line" | sed -E 's/.*Powered on:[0-9 days]* ([0-9]*) hour.*/\1/g')"
            # If the pattern did not match, sed returns the whole line (longer
            # than 3 characters): the age is then given in years and days.
            if [ ${#HOURS} -gt 3 ] ; then
                HOURS=0
                YEARS="$(echo "$line" | sed -E 's/.*Powered on: ([0-9]*) year.*/\1/g')"
            else
                YEARS=0
            fi
            DAYS="$(echo "$line" | sed -E 's/.*Powered on:[0-9 years|1 year]* ([0-9]*) day.*/\1/g')"
            CYCLES="$(echo "$line" | sed -E 's/.*Power cycles: ([0-9]*),.*/\1/g')"
            ALL_YEARS[$counter]=$YEARS
            ALL_DAYS[$counter]=$DAYS
            ALL_CYCLES[$counter]=$CYCLES
            # echo "years: $YEARS"
            # echo "days: $DAYS"
            # echo "hours: $HOURS"
            # echo "cycles: $CYCLES"
            # Remember the index of a drive that exceeds the age threshold.
            if [ "$YEARS" -gt 4 ] ; then
                if [ "$DAYS" -gt 300 ] ; then
                    WARNING_AGE=$counter
                fi
            fi
            # Remember the index of a drive that exceeds the power cycle threshold.
            if [ "$CYCLES" -gt 1000 ] ; then
                WARNING_CYCLES=$counter
            fi
        fi
    done
done <<< "$GREP_LINES"
# Report to syslog; the "CDA:" prefix makes the entries easy to find.
if [ "$WARNING_AGE" -gt 0 ] ; then
    logger "CDA: HDD headsup"
    logger "CDA: Drive ${modelNames[$WARNING_AGE]} running for ${ALL_YEARS[$WARNING_AGE]} years and ${ALL_DAYS[$WARNING_AGE]} days. Power cycles: ${ALL_CYCLES[$WARNING_AGE]}."
fi
if [ "$WARNING_CYCLES" -gt 0 ] ; then
    logger "CDA: HDD/SSD headsup"
    logger "CDA: Drive ${modelNames[$WARNING_CYCLES]} running for ${ALL_YEARS[$WARNING_CYCLES]} year(s) and ${ALL_DAYS[$WARNING_CYCLES]} days. Power cycles: ${ALL_CYCLES[$WARNING_CYCLES]}."
fi
# On any warning, also log the data of all drives.
if { [ "$WARNING_AGE" -gt 0 ] || [ "$WARNING_CYCLES" -gt 0 ] ;} ; then
    logger "CDA: All HDD/SSD data just in case:"
    counter=0
    for modelDummy in "${modelNames[@]}"
    do
        counter=$((counter+1))
        logger "CDA: ${modelNames[$counter]}: ${ALL_YEARS[$counter]} year(s) ${ALL_DAYS[$counter]} days, power cycles: ${ALL_CYCLES[$counter]}"
    done
fi
exit 0
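For anyone who prefers to compare a single number instead of separate year and day fields, here is a minimal sketch that turns the "Powered on:" text into whole days. The function name powered_on_days, the sample line and the 365-day year are my own assumptions for illustration; the line format is modeled on what the patterns in the script expect, not taken from real log output.

```shell
#!/bin/bash
# Hypothetical helper (not part of the script above): convert the age text
# after "Powered on:" into whole days, counting a year as 365 days.
# Assumes the age is followed by a comma or the end of the line.
powered_on_days() {
    local age="${1#*Powered on: }"   # drop everything up to and including the label
    age="${age%%,*}"                 # drop everything after the first comma
    local years=0 days=0
    [[ "$age" =~ ([0-9]+)\ year ]] && years="${BASH_REMATCH[1]}"
    [[ "$age" =~ ([0-9]+)\ day ]] && days="${BASH_REMATCH[1]}"
    # 10# forces base 10 so a value with a leading zero is not read as octal
    echo $(( 10#$years * 365 + 10#$days ))
}

# Usage: warn when a drive has been running for more than 5 years.
# The sample line is made up for this example.
limit_days=$(( 5 * 365 ))
sample='SMART ModelA Stats Powered on: 5 years 12 days, Power cycles: 57,'
if [ "$(powered_on_days "$sample")" -gt "$limit_days" ]; then
    echo "drive is older than 5 years"
fi
```

Hours are ignored on purpose, since they do not matter for a threshold measured in years.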
KomtGoed
(TheK)
3
The HOURS line should read as follows, matching "hour" rather than "hours" so that a singular "1 hour" is parsed as well:
HOURS="$(echo "$line" | sed -E 's/.*Powered on:[0-9 days]* ([0-9]*) hour.*/\1/g')"
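A quick way to check the pattern is to run it against a singular and a plural value. This is only a sketch: the wrapper function extract_hours and both sample lines are made up for the test and follow the format the script's patterns expect.

```shell
#!/bin/bash
# Hypothetical wrapper around the corrected sed expression.
extract_hours() {
    echo "$1" | sed -E 's/.*Powered on:[0-9 days]* ([0-9]*) hour.*/\1/g'
}

# Made-up sample lines: one with "hour", one with "hours".
extract_hours 'SMART ModelA Stats Powered on: 12 days 1 hour, Power cycles: 3,'
extract_hours 'SMART ModelA Stats Powered on: 12 days 5 hours, Power cycles: 3,'
```

With the earlier "hours" pattern the first line would not match, and sed would hand back the entire line instead of a number.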