I have run into a problem on my Checkmk server. In short, Checkmk does not read custom OIDs.
I have a server that I am trying to monitor via SNMP: sensor temperatures, fan speeds, etc. When I tested with a simple snmpwalk command, I found that it uses custom OIDs, so I reached out to the manufacturer. They gave me the MIB files, and everything was fine.
When I added the host to Checkmk I only saw 5 metrics. After troubleshooting, I found out that only the generic OIDs were discovered. So I started searching for how to add the rest and could not find anything helpful.
My question is: what is the correct way to resolve this? I understand that Checkmk does not know vendor-specific OIDs out of the box, but everything is defined in the MIB, thresholds included. On other NMSs I would just upload the MIB files; the OIDs follow the standard SNMP tree structure.
I know about monitoring through the agent, which pulls the data from ipmitool sensor, but that is no good for me. It is very limited, and I need each sensor to be its own service.
Checkmk needs to know how to handle custom OIDs. There is no way this can be achieved by uploading a MIB file as they only translate the numbers to text. They have no machine readable semantics in them.
Sorry to nitpick here, but that’s not quite correct. There is libsmi, a C library I contributed to over 25 years ago when I was at uni, that does exactly this: it parses MIBs & PIBs (Policy Information Bases) & offers their contents as structured data to the calling application. You can use that data for general processing. For example, a generic method of handling arbitrary SNMP OID values could:
read the corresponding MIBs
get the OID’s name & description from it
use the name as the service name (e.g. SNMP OID %s with %s being the name from the MIB)
based on its data type offer rules that let you specify upper/lower bounds
A rule configuring such a thing would consist of a list of OIDs and their corresponding metric names (these would have to be provided by the admin). The configured OIDs could simply be added to the big general list of OIDs to retrieve anyway, as Checkmk does with all the “real” SNMP plugins.
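To make the idea above a bit more concrete, here is a minimal Python sketch. It deliberately uses neither libsmi nor any Checkmk API: `MibObject` merely stands in for whatever a MIB parser would return, `OidRule` for one entry of the imagined rule, and the OIDs, names and bounds are all made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional

OK, CRIT = 0, 2  # conventional monitoring states

# SMIv2 base types that can sensibly be compared against bounds
NUMERIC_SYNTAXES = {"Integer32", "Unsigned32", "Gauge32", "Counter32", "Counter64"}


@dataclass(frozen=True)
class MibObject:
    """Hypothetical shape of what a MIB parser returns for one OID."""
    oid: str
    name: str
    syntax: str


@dataclass(frozen=True)
class OidRule:
    """Hypothetical entry of an 'Arbitrary SNMP OIDs' rule, filled in by the admin."""
    oid: str
    metric_name: str
    unit: str
    lower: Optional[float] = None
    upper: Optional[float] = None


def service_name(obj):
    # Service name built from the MIB name, as suggested in the list above
    return "SNMP OID %s" % obj.name


def evaluate(obj, rule, raw_value):
    """Return (state, summary) for one fetched value."""
    if obj.syntax not in NUMERIC_SYNTAXES:
        # Non-numeric types: just display the value, no bounds offered
        return OK, "%s: %s" % (service_name(obj), raw_value)
    value = float(raw_value)
    state = OK
    if rule.lower is not None and value < rule.lower:
        state = CRIT
    if rule.upper is not None and value > rule.upper:
        state = CRIT
    return state, "%s: %g %s" % (service_name(obj), value, rule.unit)


def oids_to_fetch(builtin_oids, rules):
    """Append the configured OIDs to the existing fetch list, without duplicates."""
    merged = dict.fromkeys(builtin_oids)
    merged.update(dict.fromkeys(rule.oid for rule in rules))
    return list(merged)


if __name__ == "__main__":
    temp = MibObject("1.3.6.1.4.1.99999.1.1", "cpuTemperature", "Gauge32")
    rule = OidRule(temp.oid, "cpu_temp", "C", upper=80)
    print(evaluate(temp, rule, "42"))
    print(oids_to_fetch(["1.3.6.1.2.1.1.1.0"], [rule]))
```

Note that this only covers the simple "one OID, one number, two bounds" case; anything beyond that still needs real check logic.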
Ideally the process would be:
upload MIB once (functionality is already present)
create “Arbitrary SNMP OIDs” rule
in that rule be able to select a corresponding MIB
after selection add an arbitrary number of OIDs from the previously selected MIB & provide a unit name for each metric (with several common ones pre-defined, e.g. “V” etc.)
That would go a long way to alleviate the need to write your own plugins for each and every unsupported OID.
IMHO this would put all the work into the configuration, when it would be better placed in the implementation.
When this is formalized in the implementation, the user does not have to do anything. SNMP mostly works automatically in Checkmk.
A tool could be built that reads a MIB file and generates stub code for an SNMP check plugin. But the check logic itself would have to be implemented. It is not always just a metric that gets read but often a status (which is often also an integer in SNMP) or even a combination of several OIDs.
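A small Python sketch of why the check logic does not fall out of the MIB by itself. Everything here is hypothetical: the status codes, their labels, and the RPM threshold are invented and not taken from any real MIB. The point is that the enumeration labels could come from a MIB, while the mapping to a monitoring state and the combination of two OIDs are decisions somebody has to make.

```python
OK, WARN, CRIT, UNKNOWN = 0, 1, 2, 3

# Labels as a MIB enumeration might define them (hypothetical values)
FAN_STATUS_LABELS = {1: "normal", 2: "degraded", 3: "failed", 4: "notPresent"}

# Not in any MIB: which label means which monitoring state is a judgment call
FAN_STATUS_TO_STATE = {1: OK, 2: WARN, 3: CRIT, 4: OK}

FAN_NOT_PRESENT = 4


def check_fan(status_code, rpm, min_rpm=1000):
    """Combine a status OID and a speed OID into one (state, summary) result."""
    label = FAN_STATUS_LABELS.get(status_code)
    if label is None:
        return UNKNOWN, "unknown status code %d" % status_code
    state = FAN_STATUS_TO_STATE[status_code]
    # The speed is only meaningful if a fan is actually installed
    if status_code != FAN_NOT_PRESENT and rpm < min_rpm:
        state = max(state, WARN)
    return state, "status: %s, speed: %d RPM" % (label, rpm)


if __name__ == "__main__":
    print(check_fan(1, 4200))
    print(check_fan(1, 500))
    print(check_fan(4, 0))
```

A generated stub could provide the two dictionaries' keys and labels, but the state mapping and the "ignore speed when not present" rule would still have to be written by hand.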
True, but it would empower sysadmins who aren’t developers to monitor their devices. At the moment those people are simply shit out of luck, given how many devices are out there that CheckMK doesn’t support out of the box and for which there are no third-party plugins available.
I mean… us two and several others can easily whip up a plugin for that, but even then it takes us quite a bit of time. And sometimes it’s just one or two OIDs you want to monitor. In such cases even I’d probably prefer doing a quick configuration change over implementing a whole new plugin.
Thank you @r.sander, I understand your point of view. However, I would like to use an existing plugin instead of building one. I work in implementation, so I’d rather not have lots of code to maintain.
Also, I might add that SNMP is a dying protocol, in my humble opinion. I don’t use it for most of my metrics anyway. In this particular IPMI case, I used Redfish to expose the data, since most new motherboards support it. With the Checkmk Redfish plugin it worked like magic, and it only took two minutes.
P.S. Thank you @mbunkus for sharing your thoughts. It was an insightful conversation.