Checkmk for monitoring 5000 OpenWrt router/access points - advice needed

We have a project providing Internet via wifi hotspots running mainly on consumer routers & access points, all with OpenWrt firmware, for 1000s of refugee camp residents. We are looking for a (preferably open source) solution for monitoring the uptime & ‘WAN’ bandwidth of the access points. The current number is around 5000 units.

Due to the project's limited finances & large number of hosts, we are looking to evaluate the Checkmk Raw Edition.

As the metrics required are very limited - i.e. just uptime & overall Internet use/bandwidth - would a single 16GB RAM/i5 or i7 CPU server be sufficient for checkmk? Would the Raw edition handle 3-5000 hosts? Also - what would be the best method for monitoring - SNMP or the OpenWrt agent? Any other advice for our intended use? - Thanks.

Easily. With Checkmk it’s about the “services” more so than the “host” count. Remember, you don’t have to “machine gun” query your appliances. You’ll get meaningful real stats with even a longer rotation across the hosts. Since you’re talking raw, that means you don’t necessarily even have to stick to the “1 minute” interval between checks, you could go longer. That will be less “rough” on the endpoints (which btw, is more likely where the problem will be).

I’m currently monitoring 11,000+ services, and some of those are pretty weighty. For some SNMP hosts I do limit my rate (e.g. every 2 minutes, even 5 minutes), because again, it’s rough on a lot of those hosts; their processors weren’t meant to be pounded on (talking switches and such). Edit: One of those is a large switch stack with almost 1000 interfaces; it takes time to come back and its CPU gets hit hard.
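As a rough back-of-the-envelope illustration of the interval point, the average load on the monitoring server scales with the number of services divided by the check interval. The sketch below assumes 3 services per host (a hypothetical figure, e.g. ping, uptime and one WAN interface); the function name is illustrative and not part of Checkmk.

```python
def checks_per_second(hosts: int, services_per_host: int, interval_s: float) -> float:
    """Average number of service checks the server must run per second."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return hosts * services_per_host / interval_s


# Assumed: 5000 access points, 3 services each (hypothetical figure).
for interval in (60, 120, 300):
    rate = checks_per_second(5000, 3, interval)
    print(f"{interval:>3}s interval: {rate:.0f} checks/s")
```

Under these assumptions, going from a 1 minute to a 5 minute interval cuts the average check rate from 250 to 50 per second, and each access point is queried a fifth as often.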

Sometimes there are checks that can integrate to get “more” (more services monitored) when using a specific API (custom), but without that, unless you want to write the check yourself, you’ll probably use SNMP.

Thanks for the reply & reassurance. I see there is a specific OpenWrt agent - all of our access points have been flashed with OpenWrt firmware & the core switches are Mikrotik. Is the Checkmk agent lighter on resources vs SNMP?

Hard to say. If there’s an agent, it’s usually for the purpose of producing more and better data. That “more” might also mean more load. Overall, Checkmk agents are very light on the enterprise side. Heavier on the free/raw side. Edit: Agent is the same on both enterprise and raw. So, fairly light either way agent wise.

Hi @cjcox:
the agent for RAW and enterprise is the same. In code and in weight. :wink:

Hi @adamyaqub
On servers, you should always prefer the agent over SNMP. Because it’s faster, more lightweight and produces more insights than any SNMP stack.
As we are talking about OpenWRT routers here, it’s a bit harder to say which will be more lightweight. I have used both on OpenWRT devices, and both worked smoothly for me. But your mileage may vary, depending on the underlying hardware. I would give both a try.

Regarding RAW vs. Enterprise: The main burden is on the monitoring server side. The Nagios core of the RAW edition is simply not as efficient as the Checkmk microcore (cmc). Monitoring thousands of hosts with a hundred thousand services at one-minute intervals will not scale well with the RAW edition.

HTH,
Alex

We’ll start rolling out a test on a few access points using the agent next week & see how it goes. Thanks for all the input.

Hello,
contact Checkmk directly.
When COVID appeared, they published a version for hospitals.
Perhaps …….
Ralf

I don’t know how the OpenWrt agent works, but bear in mind that when you disable a service in Checkmk you generally don’t do that on the collecting end; it’s mostly in the GUI.

For example if you don’t want to see CPU load on a server you can disable that in the GUI, but the Agent will still collect the CPU load, create RRD files for it etc, it’s just you who does not see it.

In SNMP you can disable sections, so there is a way to say “I only want to have networking interfaces” (for example). But I think you will still get all interfaces (LAN, Wi-Fi, WAN etc).
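If only the WAN interface matters, the bandwidth figure ultimately comes from two readings of that interface's octet counter (IF-MIB `ifInOctets` is 32-bit, `ifHCInOctets` is 64-bit). Below is a minimal sketch of that arithmetic, not Checkmk's actual implementation; the function names and the interface name `wan` are assumptions for illustration.

```python
from typing import Dict, Iterable


def select_interfaces(samples: Dict[str, int], wanted: Iterable[str]) -> Dict[str, int]:
    """Keep only the counters of the interfaces we care about (e.g. WAN)."""
    wanted_set = set(wanted)
    return {name: octets for name, octets in samples.items() if name in wanted_set}


def rate_bps(prev_octets: int, curr_octets: int, elapsed_s: float,
             counter_bits: int = 64) -> float:
    """Bits per second between two octet-counter readings.

    The modulo handles a single counter wrap between the two readings,
    which matters mostly for 32-bit counters on busy links.
    """
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    delta = (curr_octets - prev_octets) % (1 << counter_bits)
    return delta * 8 / elapsed_s


# Hypothetical readings taken 60 seconds apart.
before = select_interfaces({"br-lan": 500, "wan": 1_000, "wlan0": 50}, ["wan"])
after = select_interfaces({"br-lan": 900, "wan": 76_000, "wlan0": 80}, ["wan"])
print(rate_bps(before["wan"], after["wan"], 60))
```

A longer check interval does not lose traffic totals, since the counter keeps running on the device; it only averages the rate over a longer window.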

The enterprise edition is better suited for larger installations, as the microcore brings a lot of improvements. But on the other hand, I understand that cost is a challenge.
