Hi all, I am running the Raw Edition on a LAN consisting of about 30 switches (HP and UniFi) connected in a mixed topology (some in a star, others daisy-chained).
Almost every 5 minutes I get alerts for hosts being down which of course are not down. So I wonder what a normal value for rta should be. Should I alter the threshold values? This is a hotel network, and during this period (the hotel is closed) traffic is very low and most of the switches run at close to 0% CPU load.
Attached are the graphs of the last 4 hours; the last down message was 30 minutes ago.
Note that the Checkmk server has two NICs: one on the management network (same LAN as the switches) and the other on the network from which users log in to the GUI.
P.S. There are other switches on the same LAN with a longer rta, up to 40-50 ms. The maximum depth of daisy-chained switches is 5. On another site with the same setup (different location, separate Checkmk server) these statistics are much better, with no timeouts.
FYI, my server runs Ubuntu 22.04 in a Hyper-V VM, and the load on the appliance is pretty low.
This is not an rta or load problem. You have packet loss from time to time. It really looks like a networking problem. Where the problem is located cannot be said without deeper troubleshooting.
All you can do for now is check whether there is some kind of pattern among the devices with this packet loss problem.
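One way to look for such a pattern is to count the loss events per host, per upstream switch and per hour of day. This is only a rough sketch: the event list format, the host names and the `uplink_of` mapping are made up for illustration and would have to be filled in by hand from the host history and the cabling plan.

```python
from collections import Counter

def loss_pattern(events, uplink_of):
    """Count packet-loss events per host, per upstream switch and per hour.

    events:    iterable of (hostname, timestamp) tuples, timestamp as
               "YYYY-MM-DDTHH:MM" (hypothetical hand-made export).
    uplink_of: dict mapping hostname -> switch it is connected through
               (hypothetical, taken from your own topology notes).
    """
    events = list(events)
    per_host = Counter(host for host, _ in events)
    per_uplink = Counter(uplink_of.get(host, "unknown") for host, _ in events)
    per_hour = Counter(ts[11:13] for _, ts in events)  # "HH" part of the timestamp
    return per_host, per_uplink, per_hour

# Example with invented data:
# events = [("sw3", "2024-01-10T02:05"), ("sw4", "2024-01-10T02:06")]
# uplink_of = {"sw3": "sw2", "sw4": "sw2"}
# per_host, per_uplink, per_hour = loss_pattern(events, uplink_of)
```

If most events pile up behind one uplink or at one time of day, that is where to start the deeper troubleshooting.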
I’ve had instances where I simply ping “beyond the switch” instead of using the SVI or management interface on the switch.
So switch1 is 1.1.1.1, but the server plugged into it is 1.1.1.2.
No packet loss on the server whatsoever, and faster RTT, but the switch itself still shows packet loss, etc.
The explanation I got when speaking with vendors like Cisco was that they will never prioritize ICMP when other traffic needs to be processed. Many times there is no issue on the switch; it's just not going to respond as quickly as a server NIC is made to.
Same goes for IPMI interfaces. They're there and they respond, but they are slow and in many cases not as robust as a server NIC, so if you lose packets from time to time it's sometimes quite cosmetic.
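To compare the switch address with a host behind it, something like the following could be run from the monitoring server. It is a minimal sketch that assumes the Linux iputils `ping` and its usual summary lines; the helper names `parse_ping` and `ping` are my own, and the two addresses are just the example ones from above.

```python
import re
import subprocess

def parse_ping(output):
    """Extract (packet loss %, average RTT ms) from Linux iputils ping output.

    Returns None for a value that is not present, e.g. no RTT line at 100% loss.
    """
    loss = re.search(r"([\d.]+)% packet loss", output)
    rtt = re.search(r"= [\d.]+/([\d.]+)/", output)  # min/avg/max/mdev line
    return (
        float(loss.group(1)) if loss else None,
        float(rtt.group(1)) if rtt else None,
    )

def ping(host, count=20):
    """Run the system ping (Linux syntax assumed) and return (loss %, avg RTT ms)."""
    result = subprocess.run(
        ["ping", "-c", str(count), "-q", host],
        capture_output=True, text=True, check=False,
    )
    return parse_ping(result.stdout)

# Usage, with the example addresses from this post:
# print("switch:", ping("1.1.1.1"))
# print("server behind it:", ping("1.1.1.2"))
```

If the server behind the switch shows no loss while the switch address does, the loss is most likely the switch deprioritizing ICMP to its own management plane rather than a real forwarding problem.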