[Check_mk (english)] Bug: stale NFS mount on Solaris hangs statgrab, which hangs check_mk_agent [suggested fix]

Divan_Santana · August 19, 2010, 9:06pm

Hi,

I noticed a bug on Solaris servers. When an NFS mount point is stale, statgrab
hangs, which in turn hangs the Check_MK agent and the service.

This is obviously quite problematic.

The fix I would like to suggest is to change the agent from:

if statgrab > /tmp/statgrab.$$ 2>/dev/null

to:
if statgrab const. cpu. disk. general. load. mem. net. page. proc. swap. user. > /tmp/statgrab.$$ 2>/dev/null

The above runs statgrab on every stat family except the file systems (fs.), since querying those is what hangs on a stale NFS mount.
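As a rough sketch of the same idea, the argument list could be built by filtering one family out of a full list instead of typing it by hand. The names ALL_FAMILIES and statgrab_args are made up for illustration and are not part of the agent; the full list is assumed to be the families named above plus fs., and the `cat` is only a placeholder for whatever the agent does with the output.

```shell
#!/bin/sh
# Sketch only: build the statgrab argument list by leaving out one stat family.
# ALL_FAMILIES is an assumption: the families from the post plus "fs.".
ALL_FAMILIES="const. cpu. disk. fs. general. load. mem. net. page. proc. swap. user."

# statgrab_args EXCLUDE: print every family except EXCLUDE, space-separated.
statgrab_args() {
    exclude="$1"
    out=""
    for fam in $ALL_FAMILIES; do
        [ "$fam" = "$exclude" ] && continue
        out="${out:+$out }$fam"
    done
    printf '%s\n' "$out"
}

# Only call statgrab where it is actually installed.
if command -v statgrab >/dev/null 2>&1; then
    tmpfile="/tmp/statgrab.$$"
    # The command substitution is unquoted on purpose so that each
    # family is passed to statgrab as a separate argument.
    if statgrab $(statgrab_args fs.) > "$tmpfile" 2>/dev/null; then
        cat "$tmpfile"   # placeholder for the agent's own processing
    fi
    rm -f "$tmpfile"
fi
```

Calling `statgrab_args fs.` yields the same family list as in the suggested line, so the behaviour should match the one-line change.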

Could the above suggestion be included in future releases?

PS. I've managed to monitor NFS mount points on Solaris by using MRPE with the check_nfs_health.sh Nagios plugin.
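For anyone wanting to try the MRPE route, here is a minimal sketch of registering the plugin. It assumes the usual mrpe.cfg layout of one check per line, service name first and command line after it. The config path, the plugin location and the helper name add_mrpe_check are all assumptions for illustration, and no plugin arguments are shown because they depend on the plugin version in use.

```shell
#!/bin/sh
# Sketch only: register a plugin with MRPE by appending one line to mrpe.cfg.
# The default config path below is an assumption; adjust it to the local agent.
MRPE_CFG="${MRPE_CFG:-/etc/check_mk/mrpe.cfg}"

# add_mrpe_check NAME COMMAND...: append "NAME COMMAND..." to the config
# unless a line for NAME already exists.
add_mrpe_check() {
    name="$1"
    shift
    if [ -f "$MRPE_CFG" ] && grep "^$name " "$MRPE_CFG" >/dev/null 2>&1; then
        return 0
    fi
    printf '%s %s\n' "$name" "$*" >> "$MRPE_CFG"
}

# Example call (hypothetical plugin location, plugin arguments omitted):
# add_mrpe_check NFS_Health /usr/local/nagios/libexec/check_nfs_health.sh
```

The service name must not contain spaces, since the first blank separates it from the command.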

Thanks,
Divan

Mathias_Kettner · September 17, 2010, 3:24pm

Hi Divan,

I've added your change into the official version (GIT). Thanks. Could
you check this out sometime...

Greetings,

Mathias
