VMware Cluster Resources (Mem & CPU)

Howdy folks,

I’ve got an issue and have searched and read the forums but was unable to find a resolution. Hoping someone can assist.

CMK version:

OS version:
RHEL 8.5 on Checkmk server
vSphere/ESXi 6.7 Monitored Hosts

Error message:
No Error message…just the following issue.

I am monitoring just some basic info from VMware vSphere 6.7…only datastore usage and cluster CPU & memory usage.

I created Folders under Main for “vCenters” and the “ESXi Hosts”.

Under “ESXi Hosts”, I created a folder for “Clusters”.

I entered the vCenter as a monitored host in the “vCenters” folder and configured “Use piggyback data from other hosts if present”.

I also entered all the ESXi hosts as monitored hosts in the “ESXi Hosts” folder with a host tag of the VMware cluster they belong to.

I then created a cluster host in the “Clusters” folder which contained all of the ESXi hosts.

I created Clustered Service Rules. One for “CPU” and one for “Memory” and set the explicit conditions to apply to the “ESXi Hosts” folder and the custom tag of the correct VMware cluster name.

Lastly, I selected all the hosts in the “ESXi Hosts” folder and selected “Discover services” and applied the changes.

The Problem:
Now when I select the cluster host from “Monitor” it only has the PING service listed. When I select Host->Service Configuration, it shows me the CPU & Memory services with correct data and the inline help says “These services had been found by a discovery and are currently configured to be monitored.”

But even under the list of all services, these services are not shown.

Did I miss a step?

I can provide further configuration information if it would be helpful.

Thanks in advance for any help.


Just thinking out loud, but I think you didn’t follow:

And tried to do something you thought would work? Maybe I didn’t read all of what you said correctly.

I think there is a misunderstanding here about what the clustering of services does.
You cannot use the clustering of services to aggregate the values of the individual services.
What you see in your service discovery would be a random memory and CPU service from one of the ESXi hosts, not the aggregated usage.

To get an overview of the memory and CPU usage of an ESXi cluster, I would do the following.

Create the vCenter and all the ESXi hosts as you have already done.
Configure only the vCenter to be queried by the special agent. All the hosts should use the piggyback data from the vCenter.
Now all the hosts should have their own memory and CPU checks.
The next step is to create a service search. As the filter, use the service descriptions for memory and CPU combined with a pipe, like “memory|cpu”, plus the custom tag for one of your clusters.
The result should be a table with all the CPU and memory checks from that cluster. Save it as a bookmark, then modify the search options for the next cluster. In the end you will have one bookmark per cluster.
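To illustrate what a filter like “memory|cpu” plus a cluster tag selects, here is a minimal Python sketch. The host names, tag values and service descriptions are made up for the example, and the case-insensitive match is an assumption about how the search treats the pattern, so treat it as a rough model rather than what Checkmk does internally.

```python
import re

# Hypothetical rows of (host, cluster tag, service description); not real data.
services = [
    ("esx01", "cluster-a", "CPU utilization"),
    ("esx01", "cluster-a", "Memory"),
    ("esx01", "cluster-a", "Uptime"),
    ("esx02", "cluster-b", "Memory"),
]


def search(rows, pattern, cluster):
    """Keep rows whose tag equals `cluster` and whose description matches the regex."""
    rx = re.compile(pattern, re.IGNORECASE)  # assumption: case-insensitive matching
    return [row for row in rows if row[1] == cluster and rx.search(row[2])]


# The pipe means "memory OR cpu"; the tag narrows the result to one cluster.
result = search(services, "memory|cpu", "cluster-a")
for host, tag, description in result:
    print(host, description)
```

Changing only the cluster argument gives the equivalent of the next bookmark.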

Thanks, cjcox.

I did follow that.

There are no mentions of monitoring cluster resources though, and, for us at least, that’s a pretty critical feature to have. VMware balances well enough for us so that individual host resources are not as important as the resources of the cluster as a whole.

Thanks, andreas-doehler.

But it appears for the special VMware Agent, it does aggregate the values of single services?

Here is what VMware reports for the cluster as a whole.
[Image: VMware Cluster]

Then, even though I can not get these services to show up on the host, here is what CMK shows when I go to the service configuration for the cluster host.

Those numbers seem to match up pretty perfectly.

Now maybe it is still some issue of my misunderstanding, but for every other service, once it was a “monitored service” it was available under the host in the monitoring dashboard.

Please let me know if I am barking up the wrong tree and/or someone thinks I may have missed a step.

Thanks all.

Ok, it looks like the VMware CPU and Memory checks are cluster aware.
I need to have a look at one of my systems to verify if there is a problem.
Your approach should normally work.
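For anyone wondering what “cluster aware” would mean for the numbers: a rough sketch, assuming the cluster value is simply the sum over all nodes, which would be consistent with the figures matching what vCenter reports. This is not taken from the Checkmk check code; the `NodeUsage` type and the values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class NodeUsage:
    """Hypothetical per-host memory figures, in GiB."""
    name: str
    mem_used_gib: float
    mem_total_gib: float


def cluster_memory_percent(nodes):
    """Sum used and total memory over all nodes, then compute one percentage."""
    used = sum(n.mem_used_gib for n in nodes)
    total = sum(n.mem_total_gib for n in nodes)
    return 100.0 * used / total


nodes = [
    NodeUsage("esx01", 128.0, 256.0),  # 50 % on its own
    NodeUsage("esx02", 64.0, 256.0),   # 25 % on its own
]
pct = cluster_memory_percent(nodes)
print(f"cluster memory usage: {pct:.1f} %")
```

Summing first and dividing once matters when hosts have different amounts of RAM, because a plain average of the per-host percentages would then give a different number.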

Thanks, andreas-doehler.

I appreciate the assistance.

Hello, could you please describe how you did it? There is nothing in the Checkmk documentation.


So it looks like you replied to me, but quoted Andreas, so I’m not sure if you were asking me how I got to the point I am at…but here goes anyway.

Most of it is as described above and in the document that cjcox linked.

The quick rundown is that I put my vCenters in a folder and ESXi hosts in a folder.

Then, I went to Setup > Agents > VM, Cloud, Container > VMware ESXi via vSphere.

There, I created 2 rules: one to log in to my vCenters, applied to the vCenter folder, and one to log in to my ESXi hosts, applied to their folder. I played around with the combination of data to retrieve from each until I had the data I was looking for. YMMV.

Next, I went to Setup > Service monitoring rules > Clustered Services and created rules for CPU and Mem, and applied them to the ESXi hosts with a label for the cluster they belonged to.

I created a cluster host for each cluster and performed a discovery on all the ESXi hosts and cluster hosts.
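As a mental model of what the Clustered Services rules are meant to do, here is a small Python sketch. Every name in it (the hosts, the `vmware_cluster` label, the service names, the `assign` helper) is hypothetical, and the prefix-style regex match is my understanding of how service conditions are evaluated, so take it as an illustration and not as Checkmk's actual logic.

```python
import re

# Hypothetical node inventory: host -> (labels, discovered services).
nodes = {
    "esx01": ({"vmware_cluster": "prod"}, ["CPU utilization", "Memory", "Uptime"]),
    "esx02": ({"vmware_cluster": "prod"}, ["CPU utilization", "Memory", "Uptime"]),
}
cluster_host = {"name": "cl-prod", "nodes": ["esx01", "esx02"]}
# Hypothetical rule: label condition plus service name patterns.
rule = {"label": ("vmware_cluster", "prod"), "services": ["CPU utilization$", "Memory$"]}


def assign(nodes, cluster, rule):
    """Split services into those moved to the cluster host and those left on each node."""
    key, value = rule["label"]
    patterns = [re.compile(p) for p in rule["services"]]
    on_cluster, on_nodes = set(), {}
    for host in cluster["nodes"]:
        labels, services = nodes[host]
        on_nodes[host] = []
        for svc in services:
            # A service is clustered only if the host matches the label condition
            # and the service name matches one of the patterns (anchored at the start).
            if labels.get(key) == value and any(p.match(svc) for p in patterns):
                on_cluster.add(svc)
            else:
                on_nodes[host].append(svc)
    return sorted(on_cluster), on_nodes


on_cluster, on_nodes = assign(nodes, cluster_host, rule)
print(cluster_host["name"], on_cluster)
print(on_nodes)
```

In this model the CPU and memory services end up on the cluster host, and everything else stays on the individual ESXi hosts.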

@andreas-doehler Any luck recreating what I am seeing?

Bumping for visibility.

As an update.

I just finished updating to check-mk-raw-2.1.0p11.

I re-ran the discovery on the ESXi hosts and the cluster host.

I also tried deleting and re-configuring the clustered host.

No luck resolving this issue.

Bumping for visibility.

10-18-2022 - Bumping for joy!

Hi jd2022, this is a forum with free support :wink: bumping issues is usually frowned upon. If people have time and interest in your issue, they’ll help. If not, please try troubleshooting yourself or raise the issue with the paid checkmk support.

My apologies.

Any future posts of mine will be 100% less bumpy.

I’ll keep my fingers crossed for a resolution before I turn to writing a specific check for cluster resources.

Thanks for the gentle assistance.

OK, folks…in case anyone runs into this in the future.

Cluster resources (at least CPU & MEM) do work.

I think my issue was that I had created a special agent for vCenters AND the ESXi hosts with piggybacking enabled.

The ESXi hosts only need to be “normal” hosts with all of their checks brought in via the vCenter special agent and passed along via piggybacking.

Once I removed the “extra” special agent, the procedure I outlined above does indeed work for clustering metrics from the cluster.
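If you want to check that an ESXi host really is being fed by the vCenter via piggyback, here is a small Python sketch. It assumes piggyback data is stored under `tmp/check_mk/piggyback/<host>/` in the site directory, one file per source host; the site path and host names below are placeholders, and `piggyback_sources` is just a helper written for this post.

```python
from pathlib import Path


def piggyback_sources(site_home, host):
    """Return the names of the source hosts that delivered piggyback data for `host`.

    Assumes the layout <site_home>/tmp/check_mk/piggyback/<host>/<source host>.
    """
    directory = Path(site_home) / "tmp" / "check_mk" / "piggyback" / host
    if not directory.is_dir():
        return []  # no piggyback data has arrived for this host
    return sorted(p.name for p in directory.iterdir() if p.is_file())


# Placeholder site directory and host name; adjust to your own setup.
print(piggyback_sources("/omd/sites/mysite", "esx01"))
```

With the setup described above, each ESXi host should list only the vCenter as its source.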

Thanks for your patience with my poor etiquette as I worked through this.
