CMK version:
Checkmk Raw Edition 2.3.0p18
OS version:
Rocky Linux 9.6
Kernel: Linux 5.14.0-503.35.1.el9_5.x86_64
Hi everyone
I’m relatively new to Checkmk and trying to monitor my MikroTik switch in more detail. I wrote a small Python script based on this wiki link and some help from ChatGPT.
Unfortunately, neither I nor ChatGPT has been able to resolve the issue I’m running into. My background is more in system engineering than programming.
What I’ve Tried So Far
- A regular SNMP walk and an SNMP walk using cmk (both work).
However, I’m stuck at the following stage:
When I run:
cmk --debug -vvv --no-cache --detect-plugins=vega_hwstatus hostname
I get the following output:
OMD[monitoring_site]:~/local/lib/check_mk/base/plugins/agent_based$ cmk --debug -vvv --no-cache --detect-plugins=vega_hwstatus <HOSTNAME>
value store: synchronizing
Trying to acquire lock on /omd/sites/monitoring_site/tmp/check_mk/counters/<HOSTNAME>
Got lock on /omd/sites/monitoring_site/tmp/check_mk/counters/<HOSTNAME>
value store: loading from disk
Releasing lock on /omd/sites/monitoring_site/tmp/check_mk/counters/<HOSTNAME>
Released lock on /omd/sites/monitoring_site/tmp/check_mk/counters/<HOSTNAME>
Checkmk version 2.3.0p18
+ FETCHING DATA
Source: SourceInfo(hostname='<HOSTNAME>', ipaddress='<IP>', ident='snmp', fetcher_type=<FetcherType.SNMP: 7>, source_type=<SourceType.HOST: 1>)
[cpu_tracking] Start [address]
Read from cache: SNMPFileCache(<HOSTNAME>, path_template=/omd/sites/monitoring_site/tmp/check_mk/data_source_cache/snmp/{mode}/{hostname}, max_age=MaxAge(checking=0, discovery=90.0, inventory=90.0), simulation=False, use_only_cache=False, file_cache_mode=1)
[cpu_tracking] Stop [address]
Source: SourceInfo(hostname='<HOSTNAME>', ipaddress='<IP>', ident='piggyback', fetcher_type=<FetcherType.PIGGYBACK: 4>, source_type=<SourceType.HOST: 1>)
[cpu_tracking] Start [address]
Read from cache: NoCache(<HOSTNAME>, path_template=/dev/null, max_age=MaxAge(checking=0.0, discovery=0.0, inventory=0.0), simulation=False, use_only_cache=False, file_cache_mode=1)
No piggyback files for '<HOSTNAME>'. Skip processing.
No piggyback files for '<IP>'. Skip processing.
Get piggybacked data
[cpu_tracking] Stop [address]
[cpu_tracking] Start [address]
+ PARSE FETCHER RESULTS
HostKey(hostname='<HOSTNAME>', source_type=<SourceType.HOST: 1>) -> Add sections: []
HostKey(hostname='<HOSTNAME>', source_type=<SourceType.HOST: 1>) -> Add sections: []
Received no piggyback data
No piggyback files for '<HOSTNAME>'. Skip processing.
No piggyback files for '<IP>'. Skip processing.
[cpu_tracking] Stop [address]
value store: synchronizing
Trying to acquire lock on /omd/sites/monitoring_site/tmp/check_mk/counters/<HOSTNAME>
Got lock on /omd/sites/monitoring_site/tmp/check_mk/counters/<HOSTNAME>
value store: already loaded
Releasing lock on /omd/sites/monitoring_site/tmp/check_mk/counters/<HOSTNAME>
Released lock on /omd/sites/monitoring_site/tmp/check_mk/counters/<HOSTNAME>
[snmp] Success, [piggyback] Success (but no data found for this host), execution time 0.0 sec | execution_time=0.000 user_time=0.000 system_time=0.000 children_user_time=0.000 children_system_time=0.000 cmk_time_snmp=0.000 cmk_time_agent=0.000
My Script:
#!/usr/bin/env python3
from cmk.agent_based.v2 import (
    CheckPlugin,
    startswith,
    Result,
    Service,
    SimpleSNMPSection,
    SNMPTree,
    State,
    exists,
)


def parse_vega_hwstatus(string_table):
    import sys
    print("DEBUG: parse_vega_hwstatus got string_table:", string_table, file=sys.stderr)
    # Maps for labels and values by index
    label_map = {}
    value_map = {}
    for row in string_table:
        for oid, val in row:
            if oid.startswith('.1.3.6.1.4.1.14988.1.1.3.100.1.2.'):
                index = oid.split('.')[-1]
                label_map[index] = val
            elif oid.startswith('.1.3.6.1.4.1.14988.1.1.3.100.1.3.'):
                index = oid.split('.')[-1]
                value_map[index] = val
    # Build sensor dict: {label: value}
    sensor_map = {}
    for index, label in label_map.items():
        if index in value_map:
            try:
                sensor_map[label] = float(value_map[index])
            except Exception:
                pass  # skip if value isn't a number
    return sensor_map


# Fetch both the label and value OID trees dynamically
snmp_section_vega_hwstatus = SimpleSNMPSection(
    name="vega_hwstatus",
    parse_function=parse_vega_hwstatus,
    detect=startswith(".1.3.6.1.4.1.14988.1.1.3.100.1.", ""),
    fetch=[
        SNMPTree(base='.1.3.6.1.4.1.14988.1.1.3.100.1.2', oids=[]),  # labels (dynamic)
        SNMPTree(base='.1.3.6.1.4.1.14988.1.1.3.100.1.3', oids=[]),  # values (dynamic)
    ]
)


def discover_vega_hwstatus(section):
    # section is sensor_map: {label: value}
    for label in section:
        # Only show the services you want (including fan speed!)
        if any(x in label.lower() for x in ['temp', 'fan']):
            yield Service(item=label)


def check_vega_hwstatus(item, params, section):
    thresholds = params.get("temperature_levels", (75, 90))  # (warn, crit)
    warn, crit = thresholds
    value = section.get(item)
    if value is None:
        yield Result(state=State.UNKNOWN, summary=f"No value for {item}")
        return
    # Set rules for temperature and fan speed
    if "temp" in item.lower():
        if value >= crit:
            state = State.CRIT
        elif value >= warn:
            state = State.WARN
        else:
            state = State.OK
        yield Result(state=state, summary=f"{item} is {value} °C (warn {warn}, crit {crit})")
    elif "fan" in item.lower():
        # Example: treat 0 as CRIT for fan, otherwise OK
        state = State.CRIT if value == 0 else State.OK
        yield Result(state=state, summary=f"{item} is {value} RPM")
    else:
        # Default: just show value OK
        yield Result(state=State.OK, summary=f"{item} is {value}")


check_plugin = CheckPlugin(
    name="vega_hwstatus",
    service_name="Vega %s",
    sections=["vega_hwstatus"],
    discovery_function=discover_vega_hwstatus,
    check_function=check_vega_hwstatus,
    check_default_parameters={"temperature_levels": (75, 90)},
    check_ruleset_name="vega_hwstatus",
)
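To make clear what I expect the parse step to produce, here is a minimal standalone sketch that runs without Checkmk. It assumes the table arrives as one row per sensor with two columns, [label, value], rather than the (oid, value) pairs my script currently unpacks; that row shape is an assumption on my part. The sample labels and the function name parse_label_value_rows are made up for illustration.

```python
from typing import Dict, List

# Hypothetical sample rows: one row per sensor, columns = [label, value].
# The row shape and the labels are assumptions, not real device output.
SAMPLE_TABLE: List[List[str]] = [
    ["cpu-temperature", "42"],
    ["fan1-speed", "3600"],
    ["board-state", "ok"],
]


def parse_label_value_rows(string_table: List[List[str]]) -> Dict[str, float]:
    """Map each sensor label to its numeric value, skipping unusable rows."""
    sensors: Dict[str, float] = {}
    for row in string_table:
        if len(row) != 2:
            continue  # ignore rows that are not [label, value]
        label, raw_value = row
        try:
            sensors[label] = float(raw_value)
        except ValueError:
            continue  # value is not a number, so skip this sensor
    return sensors


if __name__ == "__main__":
    print(parse_label_value_rows(SAMPLE_TABLE))
```

With the sample rows, only the two numeric sensors end up in the result; the non-numeric "board-state" row is dropped, which matches the intent of the try/except in my script.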
Output of “cmk --debug -vvn hostname”:
OMD[monitoring_site]:~/local/lib/check_mk/base/plugins/agent_based$ cmk --debug -vvn HOSTNAME
value store: synchronizing
Trying to acquire lock on /omd/sites/monitoring_site/tmp/check_mk/counters/HOSTNAME
Got lock on /omd/sites/monitoring_site/tmp/check_mk/counters/HOSTNAME
value store: loading from disk
Releasing lock on /omd/sites/monitoring_site/tmp/check_mk/counters/HOSTNAME
Released lock on /omd/sites/monitoring_site/tmp/check_mk/counters/HOSTNAME
Checkmk version 2.3.0p18
+ FETCHING DATA
Source: SourceInfo(hostname='HOSTNAME', ipaddress='IP', ident='snmp', fetcher_type=<FetcherType.SNMP: 7>, source_type=<SourceType.HOST: 1>)
[cpu_tracking] Start [address]
Read from cache: SNMPFileCache(HOSTNAME, path_template=/omd/sites/monitoring_site/tmp/check_mk/data_source_cache/snmp/{mode}/{hostname}, max_age=MaxAge(checking=0, discovery=90.0, inventory=90.0), simulation=False, use_only_cache=False, file_cache_mode=6)
Not using cache (Too old. Age is 49 sec, allowed is 0 sec)
SNMP scan:
Getting OID .1.3.6.1.2.1.1.1.0: Running 'snmpget -v3 -l noAuthNoPriv -u public -m "" -M "" -On -OQ -Oe -Ot IP .1.3.6.1.2.1.1.1.0'
SNMP answer: ==> ["RouterOS CRS310-8G+2S+"]
b'RouterOS CRS310-8G+2S+'
Getting OID .1.3.6.1.2.1.1.2.0: Running 'snmpget -v3 -l noAuthNoPriv -u public -m "" -M "" -On -OQ -Oe -Ot IP .1.3.6.1.2.1.1.2.0'
SNMP answer: ==> [.1.3.6.1.4.1.14988.1]
b'.1.3.6.1.4.1.14988.1'
Using cached OID .1.3.6.1.2.1.1.1.0: 'RouterOS CRS310-8G+2S+'
Using cached OID .1.3.6.1.2.1.1.2.0: '.1.3.6.1.4.1.14988.1'
... (output truncated for brevity; sensitive information replaced)
[snmp] Success, [piggyback] Success (but no data found for this host), execution time 0.9 sec | execution_time=0.860 user_time=0.030 system_time=0.000 children_user_time=0.090 children_system_time=0.070 cmk_time_snmp=0.670 cmk_time_agent=0.000
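To reason about the detection step, here is a small standalone sketch of how I understand a "value of this OID starts with this prefix" test against the scan results shown in the output above. It is a simplified model and not Checkmk's actual implementation; the helper name value_startswith and the rule that an OID missing from the scan never matches are my assumptions.

```python
from typing import Mapping

# Values taken from the SNMP scan in the debug output above.
SCAN_RESULTS: Mapping[str, str] = {
    ".1.3.6.1.2.1.1.1.0": "RouterOS CRS310-8G+2S+",
    ".1.3.6.1.2.1.1.2.0": ".1.3.6.1.4.1.14988.1",
}


def value_startswith(scan: Mapping[str, str], oid: str, prefix: str) -> bool:
    """Simplified model: True if the value fetched for `oid` starts with `prefix`.

    Assumption: an OID that the scan did not return never matches.
    """
    value = scan.get(oid)
    return value is not None and value.startswith(prefix)


if __name__ == "__main__":
    # Test against sysObjectID, which the scan did return.
    print(value_startswith(SCAN_RESULTS, ".1.3.6.1.2.1.1.2.0", ".1.3.6.1.4.1.14988.1"))
    # Test as written in my script: a partial OID with an empty prefix.
    print(value_startswith(SCAN_RESULTS, ".1.3.6.1.4.1.14988.1.1.3.100.1.", ""))
```

Under this model the sysObjectID test matches, while the condition from my script does not, because the scan never returns a value for that partial OID. Whether Checkmk treats the partial OID the same way is something I have not confirmed.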
Can someone help me figure out why my plugin isn’t reading the OID data, even though normal SNMP walks work?
Just as a heads-up: normal SNMP monitoring works; it’s only this custom check for temperatures that doesn’t.
If you need more details or direct SNMP walk outputs, let me know!
Thanks a lot!
Best regards
Jan




