Per-process CPU monitoring

Hey @openmindz, thanks for your reply! Yep, using “%s” is more the kind of thing I was after all along, but as you say, it does mean a new CMK service for every process on each server. I agree a combined graph may well not prove useful, although I have seen some graphing tools that let you click/select items on the graph to filter them, which might have helped here. I’m also concerned about the performance aspect, although I’m not yet sure how severe the detrimental effect would actually be. It’s very likely I will come back and test this though!

I think your idea of a local check that grabs the top 5 processes and sends them back would be better, although I’m not sure how this would work if one of the processes being graphed then drops out of the top 5. I would assume it would be removed from any graph, so we wouldn’t see it from that point on. I’ll take a look at a rule per service to ensure the HTML table is visible…
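For anyone following along, here is a rough sketch of what such a local check might look like in Python. It is not a tested plugin, just an illustration under a few assumptions: the service name `Top_CPU_processes`, the metric names `top1`..`top5`, and the helper functions are all made up for this example, and it assumes a `ps` that accepts `-e -o pcpu= -o comm=`. Note that the `pcpu` value from `ps` is typically averaged over the process lifetime, so it may not reflect a short spike.

```python
#!/usr/bin/env python3
"""Sketch of a Checkmk local check reporting the top CPU consumers."""
import subprocess


def top_processes(ps_output, limit=5):
    """Return up to `limit` (command, cpu_percent) pairs, busiest first."""
    totals = {}
    for line in ps_output.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        try:
            cpu = float(parts[0])
        except ValueError:
            continue  # skip a header line or anything unparseable
        name = parts[1].strip()
        # Sum CPU across all processes that share a command name
        totals[name] = totals.get(name, 0.0) + cpu
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return ranked[:limit]


def local_check_line(ranked, service_name="Top_CPU_processes"):
    """Format one local check line: status, service name, metrics, details."""
    if not ranked:
        return f"0 {service_name} - no process data"
    # Hypothetical rank-based metric names (top1..topN): the metric set
    # stays the same even when the processes behind each rank change.
    metrics = "|".join(
        f"top{rank}={cpu:.1f}" for rank, (_, cpu) in enumerate(ranked, 1)
    )
    details = ", ".join(f"{name} {cpu:.1f}%" for name, cpu in ranked)
    return f"0 {service_name} {metrics} {details}"


if __name__ == "__main__":
    try:
        # Separate -o options with empty headers suppress the header line
        ps = subprocess.run(
            ["ps", "-e", "-o", "pcpu=", "-o", "comm="],
            capture_output=True, text=True, check=True,
        )
        print(local_check_line(top_processes(ps.stdout)))
    except (OSError, subprocess.CalledProcessError) as exc:
        # Status 3 is UNKNOWN for local checks
        print(f"3 Top_CPU_processes - could not run ps: {exc}")
```

Naming the metrics by rank rather than by process is one possible way around the "drops out of the top 5" problem: the graphs stay continuous, and the process names only appear in the service output text. The trade-off is that a graph line no longer follows a single process over time.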

@rawiriblundell couldn’t agree more with this - I expect that if this is a CPU issue (and further investigation today has pretty much confirmed it is), then the ability to monitor, and the CMK agent’s ability to respond in a timely fashion, will indeed be impaired when the CPU is maxed.

I haven’t seen monit before, but it does look useful.

I managed to access an instance today that had 2 of 4 CPU cores maxed out and have pinpointed the process. I’m now working with the third-party vendor of the application (anti-malware - no surprises) to investigate the root cause. While I can’t actually recreate the issue, I’m now able to identify (with a strong likelihood) when and on which instances it is going to occur. Restarting the problematic service resolves the issue instantly.

Thanks to everyone on this thread for all your advice/opinions. Glad to see the community here is thriving and there are plenty of like-minded people willing to help each other out. I’ve also managed to learn a thing or two along the way :slight_smile:

James