Hi Matthew,
Yes, the “Check_MK” service is the only active service; it pulls all the needed data from the target device. The time shown is the complete runtime needed to pull that data.
I remember that there was a discussion some days or weeks ago about the long runtime of interface checks on Juniper devices:
http://lists.mathias-kettner.de/pipermail/checkmk-en/2015-June/015902.html
The conclusion of that discussion was that Juniper has a rather stupid SNMP implementation, and that it is nearly impossible to monitor the complete switch once it has more than a few ports.
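To give a feeling for how quickly SNMP timeouts can inflate that runtime, here is a rough sketch. It assumes Net-SNMP-style semantics, where "retries" means additional attempts after the first one, and it reads the "7 seconds with 5 second retry" from the quoted message below as a 7 second timeout with 5 retries. The function name and parameters are made up for illustration and are not part of Check_MK.

```python
def worst_case_wait(timeout_s, retries, unanswered_requests=1):
    """Upper bound (in seconds) on time spent waiting for SNMP
    requests that never receive a reply.

    Assumption: every unanswered request is sent once and then
    repeated `retries` times, each attempt waiting `timeout_s`.
    """
    attempts = 1 + retries
    return timeout_s * attempts * unanswered_requests


# One request that the switch never answers, with a 7 s timeout and 5 retries:
print(worst_case_wait(7, 5))  # 42
```

So under these assumptions a single unanswered request can already block the check for up to 42 seconds, and a few of them in one run would be enough to explain very long “Check_MK” execution times.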
Best regards
Andreas
Matthew Nickerson mnickers@cs.cmu.edu wrote on Mon, 9 Nov 2015 at 20:01:
Can someone also please explain what the check called “check_mk”
actually does. I know this sounds so stupid, but I can’t figure it
out. I assume it is a count of how long it takes to complete all of the
checks assigned to a given host, maybe? I do know this: when interface
monitoring is on, the execution time of this process increases on
average by about 3.5 times on every device. Take a look at the attached
photo of the check_mk execution time. You can see the time when I disabled
interface monitoring. Very significant difference. I’m trying to figure
out how these two things tie together to troubleshoot this.
Thanks in advance all!
On 11/9/2015 1:28 PM, Matthew Nickerson wrote:
So, as Lance pointed out to me once, I know there was a big discussion
that took place just before I joined the list. I’ve read through it,
but I’m wondering if there have been any developments. This
conversation was in regard to SNMP checks, timeouts, and CPU usage on
Juniper devices with Check_MK.
As it stands now, when I enable interface monitoring (even solely
on my critical links), the CPU usage goes through the roof and timeouts
and retries happen all over the place. I have set all of the timers
for the various SNMP checks super low, and have tweaked my SNMP timeout
and retry values (currently at 7 seconds with 5 second retry), but it
isn’t helping much.
Let me also say that we are currently using Zenoss AND Infoblox
NetMRI, both of which are monitoring these Juniper switches. Further,
when they are turned on, they monitor EVERY single interface on
every Juniper switch at 10-minute intervals, and they do not cause the
above-mentioned problems on the switches. I have tried turning them
off, and in fact have them turned off completely on the problem
switches, and still all of the issues appear when check_mk is
monitoring even just a handful of interfaces… This leads me to
believe that there may indeed be something wrong with the Juniper
checks.
Just trying to get some input, because as it stands I won’t be able to
use check_mk, which is unfortunate, because I absolutely love it and want to
get rid of everything else!
Help!
–
Matthew Nickerson
Network Engineer
Computing Facilities, SCS
Carnegie Mellon University
(412) 268-7273
checkmk-en mailing list