Check Mk Discovery failed check_mk RAW 1.6011

Hi. I have been using check_mk for about a year now and upgraded from check_mk raw 1.5 to check_mk 1.6.
I have a problem: a few hosts (monitored via SNMP) show the service “check_mk discovery” in an error state with the message “Service time out”.
I noticed that this happens only on hosts with slow SNMP responses. How do I solve or ignore this error?
I have tried the rule “Status of the Check_MK services”, but its description says the rule does not apply to the timeout (it only works with the Check_MK Micro Core).
Is there any workaround to ignore this error?
If possible, I would like not to remove this service completely.

The first step in solving such a problem is on the command line.

  • cmk --debug -vvI hostname
    Measure the time needed to complete the discovery and look at where the device is very slow to respond. You see the checks one after another, along with the fetched OIDs.
  • If you have identified some checks where your device doesn't behave, you can use the ruleset “Disable checks” to remove these checks from discovery completely.
  • One other point you can check is how your device reacts if you change from bulkwalk to bulk or vice versa.
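
As a rough sketch of the first point (not an official Checkmk tool): the helper below, with the made-up name run_timed, runs any command, stores its complete output in a log file, and prints the elapsed seconds. The log path in the usage line is an assumption as well.

```shell
#!/bin/sh
# Hypothetical helper: run a command, keep its full output (stdout and
# stderr) in a log file, and report the exit code and elapsed seconds.
run_timed() {
    log_file=$1
    shift
    start=$(date +%s)
    "$@" >"$log_file" 2>&1
    status=$?
    end=$(date +%s)
    echo "exit=$status seconds=$((end - start)) log=$log_file"
    return "$status"
}

# Usage as the site user (replace "hostname" with the affected host).
# Note: the option is -vvI with an uppercase I (as in "Inventory").
#   run_timed /tmp/discovery.log cmk --debug -vvI hostname
```

Afterwards the log can be read top to bottom to see which check the device spends most of its time on.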

Hi, thanks for the response.
I tried the first one you suggested, cmk --debug -vvl hostname, and got no output. It just finished, and that's it. It took about 1 s to finish.
For the second point, you mean disabling any check that might cause a problem, correct? How do I know which checks have this problem?
For the third point, how do I change from bulkwalk to bulk and vice versa? I already changed “Bulk walk: Number of OIDs per bulk” to 50 for this particular host.

By the way, for your info: when I run snmpwalk against this host on the command line, I get the error “tooBig”.
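
On the “tooBig” error: this SNMP error status means the agent could not fit its answer into a single response message. With bulk requests, a smaller max-repetitions value sometimes avoids it. A minimal sketch with net-snmp's snmpbulkwalk, assuming it is installed and exits non-zero on errors; the function name, the credentials in SNMP_ARGS, and the list of values tried are placeholders, not taken from this thread:

```shell
#!/bin/sh
# Try a bulk walk with progressively smaller max-repetitions values and
# print the first value the device answers without an error.
SNMPBULKWALK=${SNMPBULKWALK:-snmpbulkwalk}

# Placeholder SNMPv3 credentials; replace USER, AUTHPASS and PRIVPASS.
SNMP_ARGS="-v3 -l authPriv -u USER -a SHA -A AUTHPASS -x AES -X PRIVPASS"

find_working_repetitions() {
    target=$1   # host name or IP address
    oid=$2      # subtree to walk, e.g. 1.3.6.1.2.1.2.2 (ifTable)
    for reps in 50 25 10 5 1; do
        # -Cr sets max-repetitions for the GETBULK requests.
        # $SNMP_ARGS is left unquoted on purpose so it splits into options.
        if $SNMPBULKWALK $SNMP_ARGS -Cr"$reps" "$target" "$oid" >/dev/null 2>&1; then
            echo "$reps"
            return 0
        fi
    done
    return 1
}

# Usage (replace "hostname" with the affected host):
#   find_working_repetitions hostname 1.3.6.1.2.1.2.2
```

If a smaller value works, the same number could then be tried in the “Bulk walk: Number of OIDs per bulk” rule for that host.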

If this happens, something in your configuration is wrong.
What do you see with cmk -D hostname?

If you get nothing from the first command, then all the other steps are useless.
Also, if you get error messages or some other output, it would be nice to see that output.

If I run cmk -D hostname, it prints the details of the host and all the monitored services.

Addresses: XXXXXXX
Tags: [address_family:ip-v4-only], [agent:no-agent], [criticality:prod], [ip-v4:ip-v4], [location:XXXX], [networking:wan], [piggyback:no-piggyback], [site:XXXX], [snmp:snmp], [snmp_ds:snmp-v2], [state:XXXX]
Labels: [state:XXXX]
Host groups: TOR_Sw, DRC
Contact groups: operator, all, check-mk-notify
Agent mode: No agent
Type of agent:
Management board - SNMP (Credentials: 'authPriv, sha, XXX, XXXXX, AES, XXXX', Bulk walk: yes, Port: 161, Inline: no)
SNMP (Credentials: 'authPriv, sha, XXX, XXXX, AES, XXXX', Bulk walk: yes, Port: 161, Inline: no)
Services:
checktype item params description groups


if 00001 {'state': [u'1'], 'errors': (0.01, 0.1), 'speed': None} Interface 00001 Interface
if 00002 {'state': [u'1'], 'errors': (0.01, 0.1), 'speed': None} Interface 00002 Interface

(PS: I have replaced all the sensitive info with XXXX.)

If I run the service discovery for that single host, it reports that SNMP is wrong. But during configuration of the host, when I clicked “save and test”, it said SNMP works.
As mentioned before, if I run snmpwalk manually, it is able to read the OIDs but then stops with the error tooBig.

Previously, I used bulk discovery to detect all services for this host, as it isn't affected by the normal timeout (I read this in one of the cmk forums/mailing lists). Maybe I have to increase the SNMP timeout for that particular host?
I entered all the cmk hosts using the GUI.

I have tried the first command with a working (OK) host, but I don't get any result there either.
BTW, I thought that command should only give output if there is an error?

Please remove the configuration for the management board. I think the error message comes from there.

And again, cmk --debug -vvI hostname must give you some output.
Was the command issued on the right slave?

Hi Andreas,

You are correct, the error comes from the configuration for the management board. It seems the password I used for SNMP v3 authPriv is incorrect (I finally found this out on the command line with cmk -D hostname).
By the way, you are also correct that cmk --debug -vvI does output something. I had mistaken the uppercase I for a lowercase l.

Thanks for your help.
