Hey all!
Has anyone seen any weirdness with Comware 7.1.045 not being polled via WATO? My host sits on stale all the time and provides no historical data whatsoever.
I’m on 1.5.0p22 CRE, and polling works perfectly via the CLI: the status output is there in 3 seconds.
But when I try to poll from the GUI, I barely get any updates at all. It basically freezes the GUI and does nothing else, as if it’s hung.
This is a brand new install, and I don’t usually monitor Comware at all.
Just seeing what I might do to get this working better.
Thanks!
Ok, how about advice on increasing timeouts? I know the list has advice on SNMP timeouts and I have read it; however, I’ve tried quite a few settings and I’m not sure which ones I should really be paying attention to.
1. I AM able to successfully poll these switches and their interfaces
2. They work fine when they are not stacked; they also work fine when they ARE stacked, but they take longer to poll
3. I have successfully added the exact same switch models and software revisions to the same monitoring server as stacks of 1-4 members, with no issues
4. The stack I’m having issues with has 7 members and takes a while to poll, which is likely why I don’t get any results after adding the host
All that said, the stats are as follows:
OMD[site]:~$ time cmk -IIv hs-mdf-1
Discovering services on: hs-mdf-1
hs-mdf-1:
+ FETCHING DATA
[snmp] Execute data source
[piggyback] Execute data source
No piggyback files for 'hs-mdf-1'. Skip processing.
No piggyback files for '192.168.101.27'. Skip processing.
+ EXECUTING DISCOVERY PLUGINS (6)
208 if64
1 snmp_info
1 snmp_uptime
SUCCESS - Found 210 services, no host labels
real 1m25.880s
user 0m1.614s
sys 0m0.670s
and then
OMD[site]:~$ time cmk -n hs-mdf-1
OK - [snmp] Success, execution time 90.8 sec | execution_time=90.825 user_time=1.410 system_time=0.090 children_user_time=0.180 children_system_time=0.200 cmk_time_snmp=88.942 cmk_time_agent=-0.008
real 1m32.197s
user 0m2.794s
sys 0m0.454s
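To put those numbers in perspective, here is a small Python sketch that splits the perfdata from the `cmk -n` run into its parts. The values are copied from the output above; the helper name `parse_perfdata` is made up for illustration, and dividing by the 208 discovered if64 services only gives a rough per-interface figure, since a few other SNMP queries are included in the total.

```python
# Perfdata copied from the `cmk -n hs-mdf-1` output above.
perfdata = (
    "execution_time=90.825 user_time=1.410 system_time=0.090 "
    "children_user_time=0.180 children_system_time=0.200 "
    "cmk_time_snmp=88.942 cmk_time_agent=-0.008"
)


def parse_perfdata(text):
    """Turn 'key=value key=value ...' into a dict of floats (hypothetical helper)."""
    return {key: float(value) for key, value in (item.split("=", 1) for item in text.split())}


metrics = parse_perfdata(perfdata)

# Share of the whole check run that was spent waiting on SNMP.
snmp_share = metrics["cmk_time_snmp"] / metrics["execution_time"]

# Rough estimate only: total SNMP time spread over the 208 if64 services.
per_interface = metrics["cmk_time_snmp"] / 208

print(f"SNMP share of run time: {snmp_share:.1%}")
print(f"Rough SNMP time per interface: {per_interface:.2f} s")
```

So nearly all of the 90 seconds is SNMP wait time rather than CPU time on the monitoring server.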
Given those stats, what would you guys suggest doing next to increase SNMP timeouts?
I have made changes to checks and timeouts but still don’t seem to have the right combination. Is anyone familiar with how to calculate the timeouts and put them into the proper rules?
I’m just looking for clear rules to use instead of guessing. I’ve played with the following, but I probably haven’t found the right combination yet:
I’m on 1.6.0p3 CRE
Changed snmp_timeout from 60 to 180 in: /opt/omd/versions/1.6.0p3.cre/lib/python/cmk/gui/plugins/wato/check_mk_configuration.py
Rebooted the server after the change
I’ve tried modifying these values:
Check intervals for SNMP checks (3 minutes for only this host)
Normal check interval for host checks (3 minutes for only this host)
Normal check interval for service checks (Check_MK service, 3 minutes)
Timing settings for SNMP access (response timeout for a single query 180 seconds with 2 retries)
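As a sanity check on that last setting, here is a minimal Python sketch of the arithmetic. It assumes the usual Net-SNMP convention that retries are additional attempts after the first one, so a single unanswered query can block for timeout × (retries + 1). The function name is made up for illustration.

```python
def worst_case_wait(timeout_s, retries):
    """Seconds one unanswered SNMP query can block: the first try plus each retry.

    Assumes retries are *additional* attempts after the first (Net-SNMP style).
    """
    return timeout_s * (retries + 1)


check_interval_s = 3 * 60   # the 3-minute check interval tried above
observed_run_s = 90.8       # execution time reported by `cmk -n`

blocked = worst_case_wait(timeout_s=180, retries=2)

print(f"One lost query can block for up to {blocked} s")
print(f"Longer than the check interval: {blocked > check_interval_s}")
print(f"Observed run fits in the interval: {observed_run_s < check_interval_s}")
```

Under that assumption, one lost query could stall the check for 540 seconds, three times the 3-minute interval, while the complete run measured on the CLI needs only about 91 seconds. The per-query timeout covers one request and its response, not the whole walk of the stack.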
I’m just wondering if anyone has advice on the proper settings for increasing the timeouts, so that I can at least get some stats from the interfaces. How would you guys configure it, if you have experience with long polling times for stacks or with slow SNMP in general?
Thanks
On Oct 4, 2019, 2:50 PM -0500, Brian Binder via checkmk-en <checkmk-en@lists.mathias-kettner.de> wrote the original message at the top of this thread.