Monitoring of VSAN Cluster Overall health and Network Health?

Hi there,

Situation:
I want to monitor the overall health of my vSAN Clusters and additionally retrieve the following data:

  • vSAN Network Health
  • vSAN Data Health
  • vSAN Cluster Capacity

I am aware that vSphere offers the following capabilities to monitor the data outlined above:

I am also aware that CheckMK has the “VMWARE ESX via vSphere” rule, which I have implemented and which allows me to monitor the ESXi hosts within the vSAN cluster.

Help Required:
Is anyone currently monitoring their vSphere vSAN through CheckMK, and what method did you use to pull the data into CheckMK?

https://exchange.checkmk.com/p/vsan

Regards,
ttr

Hi @ttr,

is cmk v2 supported?

BG

Thanks @ttr , this is what I was looking for :slight_smile:

A question about installing this mkp file, as this will be the first time I am doing so. As outlined in the relevant documentation, I have to run ‘mkp install vscan-0.3d.mkp’ on the checkMK server. Once the change is activated, what additional configuration is required for the vSAN cluster to start being monitored?

Thanks,
Arj

@dns_es it says on the exchange page that the minimum checkmk version required is 1.6.0, so I’m hoping that means it is 2.0 compatible.

Sorry, but normally a package will not work on both 1.6 and 2.0 without modifications.
It is very rare for that to happen.

  • Create a user in vSphere and give it read-only permission
  • Configure the WATO rule “Check VMware VSAN” for your vCenter host
  • Re-inventory the vCenter host. If nothing is found, wait a while and retry. The VSAN API is slow as hell, and a query might take more than one minute on large clusters

Regards,
ttr

All my infrastructure instances are on 1.6, so I can’t tell. Maybe it works on 2.0. Give it a try and tell me.

Regards,
ttr

If I am already monitoring the vCenter with cmk’s standard special agent, can I use your vSAN check in parallel? Or could this overwhelm the vCenter?

At least this is what I do.

Probably not. Almost everything within vSphere is just an API call, so vCenter should be able to answer a few info queries very easily. This is what it’s made for. And because the VSAN API (which is different from the vSphere API) is very slow, the special agent caches its results itself and only starts a query every 15 minutes (if left on defaults), which reduces the overhead even more.
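To illustrate the caching idea, here is a minimal Python sketch of a file-based cache with a 15-minute maximum age. It is not the special agent's actual code: the function name `get_cached`, the JSON file format, and the `fetch` callback are all assumptions made up for this example.

```python
import json
import os
import tempfile
import time

CACHE_MAX_AGE = 15 * 60  # seconds; the 15-minute default mentioned above


def get_cached(cache_path, fetch, max_age=CACHE_MAX_AGE, now=None):
    """Return cached data if it is fresh enough, else call fetch() and store the result.

    Hypothetical helper for illustration; fetch stands in for the slow API query.
    """
    now = time.time() if now is None else now
    try:
        age = now - os.path.getmtime(cache_path)
        if 0 <= age < max_age:
            with open(cache_path) as fh:
                return json.load(fh)
    except (OSError, ValueError):
        pass  # missing or unreadable cache: fall through to a fresh query

    data = fetch()
    # Write to a temporary file and rename it, so a parallel reader
    # never sees a half-written cache file.
    fd, tmp_path = tempfile.mkstemp(dir=os.path.dirname(cache_path) or ".")
    with os.fdopen(fd, "w") as fh:
        json.dump(data, fh)
    os.replace(tmp_path, cache_path)
    return data
```

With something like this, every check cycle can call `get_cached(...)`, but the expensive query only runs when the cache file is older than `max_age`.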

Regards,
ttr

all right, i will give it a try…

hi,

The inventory looks good, but the pending state never ends, and my services monitored with cmk’s standard special agent have now changed to unknown… It seems like there are problems with parallel checks?

Hi,

@dns_es, we’ve also seen this issue on very large clusters, where API queries run for some minutes. The special agent should actually avoid a double run, but this does not work correctly in 0.3x. This issue is fixed in version 0.4 of the package, which was uploaded to exchange yesterday and is now under review.
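For readers wondering what "avoiding a double run" can look like, here is a minimal Python sketch using a non-blocking file lock. It only shows the general technique and is not taken from the package; the function name `try_lock` and the lock file path are assumptions. It relies on `fcntl`, so it is Linux/Unix only.

```python
import fcntl
import os


def try_lock(lock_path):
    """Return an open file holding an exclusive lock, or None if another run holds it.

    Hypothetical helper for illustration. Keep the returned file object open
    for as long as the query runs; closing it releases the lock.
    """
    fh = open(lock_path, "w")
    try:
        # LOCK_NB makes flock fail immediately instead of waiting for the lock.
        fcntl.flock(fh, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        fh.close()
        return None
    return fh
```

A second invocation that gets `None` back can simply serve the previous cached result (or nothing) instead of starting another long-running query.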

You should disable the special agent and clean up all its temporary files in $OMD_ROOT/tmp: agent_vsan., vsan-. If you can’t wait for the review, I can provide the new package.
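If you prefer to script the cleanup, a small Python sketch could look like the one below. It treats the two names from the post as file name prefixes, which is an assumption, and the function names `find_stale` and `clean` are made up for this example. It defaults to a dry run, so nothing is deleted until you have checked the list.

```python
import os

# Names as written in the post; treating them as prefixes is an assumption.
PREFIXES = ("agent_vsan.", "vsan-")


def find_stale(tmp_dir, prefixes=PREFIXES):
    """Return the sorted paths of regular files in tmp_dir whose names start with a prefix."""
    with os.scandir(tmp_dir) as entries:
        return sorted(
            entry.path
            for entry in entries
            if entry.is_file() and entry.name.startswith(prefixes)
        )


def clean(tmp_dir, dry_run=True):
    """List matching files; delete them only when dry_run is False."""
    paths = find_stale(tmp_dir)
    if not dry_run:
        for path in paths:
            os.remove(path)
    return paths
```

As the site user you would call `clean(os.path.join(os.environ["OMD_ROOT"], "tmp"))` first to see what matches, and then repeat the call with `dry_run=False`.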

Regards,

OK, good to hear. I will wait until the review is done, but thanks for your offer :slight_smile: