[Check_mk (english)] Check_MK, SNMP, Juniper, Help!

Hi Matthew,

Yes, the “Check_MK” service is the only active service; it pulls all the needed data from the target device. The time shown is the complete runtime needed to pull that data.
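
To make that concrete, here is a toy model in Python, not actual Check_MK code. The section names and per-section costs are made up for illustration; the only point is that one active check fetches everything, so every enabled section adds to the reported execution time.

```python
from typing import Dict, List, Tuple

# Hypothetical per-section fetch costs in seconds (illustrative numbers only,
# not measurements from any real device).
SECTION_COST: Dict[str, float] = {"uptime": 0.2, "cpu": 0.3, "interfaces": 1.25}


def check_mk_service(enabled_sections: List[str]) -> Tuple[float, Dict[str, str]]:
    """Toy model of the single active check.

    It fetches every enabled section once, adds up the time spent, and
    hands a result to one passive service per section.
    """
    total_runtime = 0.0
    passive_results: Dict[str, str] = {}
    for name in enabled_sections:
        total_runtime += SECTION_COST[name]  # time spent pulling this section
        passive_results[name] = "OK"         # passive service fed from fetched data
    return total_runtime, passive_results


without_if, _ = check_mk_service(["uptime", "cpu"])
with_if, _ = check_mk_service(["uptime", "cpu", "interfaces"])
print(f"without interfaces: {without_if:.2f}s, with interfaces: {with_if:.2f}s")
```

In this model the interface section alone dominates the total, which is the kind of jump you would see in the execution time graph when interface monitoring is switched on.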

I remember there was a discussion some days or weeks ago about the long runtime of interface checks on Juniper devices.

http://lists.mathias-kettner.de/pipermail/checkmk-en/2015-June/015902.html

The result of that discussion was that Juniper has a rather stupid SNMP implementation, and it is nearly impossible to monitor the complete switch if it has more than a few ports.

Best regards

Andreas

Matthew Nickerson mnickers@cs.cmu.edu wrote on Mon, 9 Nov 2015 at 20:01:

Can someone also please explain what the check called “check_mk” actually does? I know this sounds stupid, but I can’t figure it out. I assume it is a measure of how long it takes to complete all of the checks assigned to a given host, maybe? I do know this: when interface monitoring is on, the execution time of this check increases on average by about 3.5 times on every device. Take a look at the attached photo of the check_mk execution time. You can see the time when I disabled interface monitoring. Very significant difference. I’m trying to figure out how these two things tie together to troubleshoot this.

Thanks in advance all!

On 11/9/2015 1:28 PM, Matthew Nickerson wrote:

So, as Lance pointed out to me once, I know there was a big discussion that took place just before I joined the list. I’ve read through it, but I’m wondering if there have been any developments. This conversation was in regard to SNMP checks, timeouts, and CPU usage on Juniper devices with Check_MK.

As it stands now, when I enable interface monitoring (even solely on my critical links), the CPU usage goes through the roof and timeouts and retries happen all over the place. I have set all of the timers for the various SNMP checks super low, and have tweaked my SNMP timeout and retry values (currently at 7 seconds with 5 second retry), but it isn’t helping much.
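
As a rough illustration of why those timeout and retry values matter, here is a small Python sketch of the worst-case wait for a single unanswered SNMP request. It assumes every attempt waits the full timeout with no backoff (how Net-SNMP's `-t` and `-r` options behave, as far as I know), and it reads the setting above as a 7 second timeout with 5 retries, which is only one possible interpretation. The function name is made up for this example.

```python
def worst_case_wait(timeout_s: float, retries: int) -> float:
    """Upper bound on how long one unanswered SNMP request can block.

    Assumption: the first attempt plus each retry waits one full timeout,
    with no backoff between attempts.
    """
    attempts = retries + 1  # the initial request plus every retry
    return timeout_s * attempts


# Assumed reading of the settings above: 7 s timeout, 5 retries.
print(worst_case_wait(7, 5), "seconds per unanswered request")
```

Under these assumptions, a handful of unanswered requests during one check run is already enough to push the total runtime into minutes.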

Let me also say that we are currently using Zenoss AND Infoblox NetMRI, both of which are monitoring these Juniper switches. Further, when they are turned on, they monitor EVERY single interface on every Juniper switch at 10-minute intervals, and they do not cause the above-mentioned problems on the switches. I have tried turning them off, and in fact have them turned off completely on the problem switches, and still all of the issues appear when check_mk is monitoring even just a handful of interfaces… This leads me to believe that there may indeed be something wrong with the Juniper checks.

Just trying to get some input, because as it stands I won’t be able to use check_mk. That is unfortunate, because I absolutely love it and want to get rid of everything else!

Help!

Matthew Nickerson

Network Engineer

Computing Facilities, SCS

Carnegie Mellon University

(412) 268-7273


checkmk-en mailing list

checkmk-en@lists.mathias-kettner.de

http://lists.mathias-kettner.de/mailman/listinfo/checkmk-en