stormshield_packets no longer works since my upgrade to 2.0.0p1

Hello

Since I upgraded to 2.0.0 CEE, I have observed a problem with a Stormshield appliance on a slave site.

Some services can no longer be discovered, so I no longer get the associated metrics. (Other services are still discovered, so SNMP traffic itself does not seem to be the issue.)

The problem occurs with the stormshield_packets plugin.

I tried the command
cmk -i --plugins stormshield_packets Myhost
but it gives no output.

cmk -L | grep storm does list the plugin.

I looked into the plugin and saw that it queries OID .1.3.6.1.4.1.11256.1.4.1.1.
I took the snmpwalk from the check and found that series, so the data is in the output.

It seems the plugin is not triggered, or not executed correctly.

Does anyone have the same problem?

I tried with 2.0.0 and 2.0.0p1.

What do you get if you run cmk --debug -vvII YourHost?
I think you will get some error messages saying that the stormshield plugin cannot be converted to 2.0.

Hello Andreas,

You are right, it seems there is a crash involved.

$  cmk --debug -vvII  MyHost
Discovering services and host labels on: MyHost
MyHost:
+ FETCHING DATA
  Source: SourceType.HOST/FetcherType.TCP
[cpu_tracking] Start [7f3c2992dd00]
Connecting via TCP to 192.168.0.1:6556 (5.0s timeout)
[cpu_tracking] Stop [7f3c2992dd00 - Snapshot(process=posix.times_result(user=0.0, system=0.0, children_user=0.0, children_system=0.0, elapsed=5.010000001639128))]
Try aquire lock on /omd/sites/company/var/check_mk/crashes/base/5a7a9804-8c15-11eb-b8bc-000c29f47f56/crash.info
Got lock on /omd/sites/company/var/check_mk/crashes/base/5a7a9804-8c15-11eb-b8bc-000c29f47f56/crash.info
Releasing lock on /omd/sites/company/var/check_mk/crashes/base/5a7a9804-8c15-11eb-b8bc-000c29f47f56/crash.info
Released lock on /omd/sites/company/var/check_mk/crashes/base/5a7a9804-8c15-11eb-b8bc-000c29f47f56/crash.info
Traceback (most recent call last):
  File "/omd/sites/company/bin/cmk", line 92, in <module>
    exit_status = modes.call(mode_name, mode_args, opts, args)
  File "/omd/sites/company/lib/python3/cmk/base/modes/__init__.py", line 69, in call
    return handler(*handler_args)
  File "/omd/sites/company/lib/python3/cmk/base/modes/check_mk.py", line 1531, in mode_discover
    discovery.do_discovery(
  File "/omd/sites/company/lib/python3/cmk/base/discovery.py", line 370, in do_discovery
    fetcher_messages=list(
  File "/omd/sites/company/lib/python3/cmk/base/checkers/_checkers.py", line 246, in fetch_all
    raw_data = source.fetch()
  File "/omd/sites/company/lib/python3/cmk/base/checkers/_abstract.py", line 162, in fetch
    with self._make_fetcher() as fetcher:
  File "/omd/sites/company/lib/python3/cmk/fetchers/_base.py", line 189, in __enter__
    self.open()
  File "/omd/sites/company/lib/python3/cmk/fetchers/tcp.py", line 73, in open
    self._socket.connect(self.address)
socket.timeout: timed out

I see there are a ton of files in /omd/sites/company/var/check_mk/crashes/[base]|[check].
I did an omd stop, wiped all the files, then an omd start, but got the same result.

The content of the crash file is:

cat /omd/sites/company/var/check_mk/crashes/base/1341c83e-8c17-11eb-b748-000c29f47f56/crash.info
{"time": 1616531629.4787204, "os": "Debian GNU/Linux 10 (buster)", "version": "2.0.0p1", "edition": "cee", "core": "cmc", "python_version": "3.8.7 (default, Feb  3 2021, 02:52:08) \n[GCC 10.1.0]", "python_paths": ["/opt/omd/versions/2.0.0p1.cee/bin", "/omd/sites/company/local/lib/python3", "/omd/sites/company/lib/python38.zip", "/omd/sites/company/lib/python3.8", "/omd/sites/company/lib/python3.8/lib-dynload", "/omd/sites/company/lib/python3.8/site-packages", "/omd/sites/company/lib/python3"], "id": "1341c83e-8c17-11eb-b748-000c29f47f56", "crash_type": "base", "exc_type": "timeout", "exc_value": "timed out", "exc_traceback": [["/omd/sites/company/bin/cmk", 92, "<module>", "exit_status = modes.call(mode_name, mode_args, opts, args)"], ["/omd/sites/company/lib/python3/cmk/base/modes/__init__.py", 69, "call", "return handler(*handler_args)"], ["/omd/sites/company/lib/python3/cmk/base/modes/check_mk.py", 1531, "mode_discover", "discovery.do_discovery("], ["/omd/sites/company/lib/python3/cmk/base/discovery.py", 370, "do_discovery", "fetcher_messages=list("], ["/omd/sites/company/lib/python3/cmk/base/checkers/_checkers.py", 246, "fetch_all", "raw_data = source.fetch()"], ["/omd/sites/company/lib/python3/cmk/base/checkers/_abstract.py", 162, "fetch", "with self._make_fetcher() as fetcher:"], ["/omd/sites/company/lib/python3/cmk/fetchers/_base.py", 189, "__enter__", "self.open()"], ["/omd/sites/company/lib/python3/cmk/fetchers/tcp.py", 73, "open", "self._socket.connect(self.address)"]], "local_vars": "eydzxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx==", "details": {"argv": ["/omd/sites/company/bin/cmk", "--debug", "-vvII", "Myhost"], "env": {"SHELL": "/bin/bash", "OMD_ROOT": "/omd/sites/company", "NAGIOS_PLUGIN_STATE_DIRECTORY": "/omd/sites/company/var/monitoring-plugins", "PWD": "/omd/sites/company", "LOGNAME": "company", "MANPATH": "/omd/sites/company/share/man:", "MODULEBUILDRC": "/omd/sites/company/.modulebuildrc", "HOME": "/omd/sites/company", "LANG": "C.UTF-8", 
"PERL5LIB": "/omd/sites/company/local/lib/perl5/lib/perl5:/omd/sites/company/lib/perl5/lib/perl5:", "OMD_SITE": "company", "TERM": "xterm", "USER": "company", "PERL_MM_OPT": "INSTALL_BASE=/omd/sites/company/local/lib/perl5/", "SHLVL": "1", "LD_LIBRARY_PATH": "/omd/sites/company/local/lib:/omd/sites/company/lib", "REQUESTS_CA_BUNDLE": "/omd/sites/company/var/ssl/ca-certificates.crt", "LC_ALL": "C.UTF-8", "PATH": "/omd/sites/company/lib/perl5/bin:/omd/sites/company/local/bin:/omd/sites/company/bin:/omd/sites/company/local/lib/perl5/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games", "MP_STATE_DIRECTORY": "/omd/sites/company/var/monitoring-plugins", "MAIL": "/var/mail/company", "MAILRC": "/omd/sites/company/etc/mail.rc", "_": "/omd/sites/company/bin/cmk"}}}

Is this because the plugin needs to be adapted to cmk 2.0, or does my host lack some resource?

Hi,

You are trying to connect to an SNMP device via the Checkmk agent (port 6556). This will not work. So configure your host to “No agent” and SNMP v2c.

Cheers,
Christian

Hello Christian,

Yes, you are right. Sorry, I did many tests and left the host set to “Normal checkagent or special agent if configured”.
So I set it back to no agent.

I also moved the check to the central site so that I can compare with MyHost2, which is the exact same Stormshield model and which works!

Here again is the output of the cmk command:

cmk -vII MyHost
Discovering services and host labels on: MyHost
MyHost:
+ FETCHING DATA
[SNMPFetcher] Execute data source
No piggyback files for 'MyHost'. Skip processing.
No piggyback files for '192.168.0.1'. Skip processing.
[PiggybackFetcher] Execute data source
+ PARSE FETCHER RESULTS
Received no piggyback data
+ EXECUTING HOST LABEL DISCOVERY
+ PERFORM HOST LABEL DISCOVERY
+ EXECUTING DISCOVERY PLUGINS (7)
  6 hr_fs
  4 if64
  1 mem_used
  1 snmp_info
  1 uptime
SUCCESS - Found 13 services, no host labels

and for MyHost2 it is

cmk -vII MyHost2
Discovering services and host labels on: MyHost2
MyHost2:
+ FETCHING DATA
[SNMPFetcher] Execute data source
No piggyback files for 'MyHost2'. Skip processing.
No piggyback files for '192.168.0.2'. Skip processing.
[PiggybackFetcher] Execute data source
+ PARSE FETCHER RESULTS
Received no piggyback data
+ EXECUTING HOST LABEL DISCOVERY
+ PERFORM HOST LABEL DISCOVERY
+ EXECUTING DISCOVERY PLUGINS (16)
  1 hr_cpu
  6 hr_fs
  4 if64
  1 mem_used
  1 snmp_info
  2 stormshield_cpu_temp
  1 stormshield_disk
  1 stormshield_info
 13 stormshield_packets
  3 stormshield_policy
  1 stormshield_route
 17 stormshield_services
  5 stormshield_updates
  1 uptime
SUCCESS - Found 57 services, no host labels

I compared the snmpwalk output of MyHost and MyHost2. Of course diff reports differences, but the .1.3.6.1.4.1.11256.1.4.1.1 OIDs are present in both.

When I run the command with --debug -vvII, I can see that on MyHost2 some bulk walks are done, but not on MyHost (that probably explains it... but why?)

stormshield_packets: Fetching data (SNMP walk cache is disabled)
Executing BULKWALK of ".1.3.6.1.4.1.11256.1.4.1.1.2" on MyHost2
Executing BULKWALK of ".1.3.6.1.4.1.11256.1.4.1.1.3" on MyHost2
Executing BULKWALK of ".1.3.6.1.4.1.11256.1.4.1.1.6" on MyHost2
Executing BULKWALK of ".1.3.6.1.4.1.11256.1.4.1.1.11" on MyHost2
Executing BULKWALK of ".1.3.6.1.4.1.11256.1.4.1.1.12" on MyHost2
Executing BULKWALK of ".1.3.6.1.4.1.11256.1.4.1.1.16" on MyHost2
Executing BULKWALK of ".1.3.6.1.4.1.11256.1.4.1.1.23" on MyHost2
Executing BULKWALK of ".1.3.6.1.4.1.11256.1.4.1.1.24" on MyHost2
stormshield_policy: Fetching data (SNMP walk cache is disabled)
Executing BULKWALK of ".1.3.6.1.4.1.11256.1.8.1.1.2" on MyHost2
Executing BULKWALK of ".1.3.6.1.4.1.11256.1.8.1.1.3" on MyHost2
Executing BULKWALK of ".1.3.6.1.4.1.11256.1.8.1.1.5" on MyHost2
stormshield_route: Fetching data (SNMP walk cache is disabled)
Executing BULKWALK of ".1.3.6.1.4.1.11256.1.14.1.1.1" on MyHost2
Executing BULKWALK of ".1.3.6.1.4.1.11256.1.14.1.1.2" on MyHost2
Executing BULKWALK of ".1.3.6.1.4.1.11256.1.14.1.1.4" on MyHost2
Executing BULKWALK of ".1.3.6.1.4.1.11256.1.14.1.1.5" on MyHost2
Executing BULKWALK of ".1.3.6.1.4.1.11256.1.14.1.1.7" on MyHost2
Executing BULKWALK of ".1.3.6.1.4.1.11256.1.14.1.1.9" on MyHost2
stormshield_services: Fetching data (SNMP walk cache is disabled)
Executing BULKWALK of ".1.3.6.1.4.1.11256.1.7.1.1.2" on MyHost2
Executing BULKWALK of ".1.3.6.1.4.1.11256.1.7.1.1.3" on MyHost2
Executing BULKWALK of ".1.3.6.1.4.1.11256.1.7.1.1.4" on MyHost2
stormshield_updates: Fetching data (SNMP walk cache is disabled)
Executing BULKWALK of ".1.3.6.1.4.1.11256.1.9.1.1.2" on MyHost2
Executing BULKWALK of ".1.3.6.1.4.1.11256.1.9.1.1.3" on MyHost2
Executing BULKWALK of ".1.3.6.1.4.1.11256.1.9.1.1.4" on MyHost2

Hi, this sounds strange. When you run cmk --snmpwalk MyHost and the same for MyHost2, you will find the output in ~/var/check_mk/snmpwalks. Please diff these files.
You can also try to flush MyHost with cmk --flush MyHost and afterwards run a new discovery with cmk -vvII MyHost.

Cheers, Christian

Interesting!!

First I tried the flush and discovery, but nothing changed.
I also ran the snmpwalk, and it gave useful information!!

cmk -v --snmpwalk MyHost2
MyHost2:
Walk on ".1.3.6.1.2.1"...
7643 variables.
Walk on ".1.3.6.1.4.1"...
9888 variables.
Wrote fetched data to /omd/sites/company/var/check_mk/snmpwalks/MyHost2.

cmk -v --snmpwalk MyHost
MyHost:
Walk on ".1.3.6.1.2.1"...
ERROR: SNMP error 0/-24 (Timeout)
Error: SNMP Error on MyHost: SNMP query timed out after 4 tries within 93.63 seconds
Walk on ".1.3.6.1.4.1"...
16477 variables.
Wrote fetched data to /omd/sites/company/var/check_mk/snmpwalks/MyHost.

So I added a rule setting the SNMP timeout to 50 seconds.
That resolved this problem:

cmk -v --snmpwalk MyHost
MyHost:
Walk on ".1.3.6.1.2.1"...
31806 variables.
Walk on ".1.3.6.1.4.1"...
16442 variables.
Wrote fetched data to /omd/sites/company/var/check_mk/snmpwalks/MyHost.

The output of diff /omd/sites/company/var/check_mk/snmpwalks/MyHost /omd/sites/company/var/check_mk/snmpwalks/MyHost2 shows that the two walks are quite different.

MyHost is version 3.7.18 LTSB
MyHost2 is version 3.7.17 LTSB

Tonight I will update MyHost2 to 3.7.18, and I expect that to break my SNMP graphs (I suppose the problem comes from the 3.7.18 Stormshield firmware).

I can confirm. I upgraded the Stormshield firmware on MyHost2 and lost the Stormshield metrics there too :frowning:

I also tried upgrading to the latest LTSB, 3.7.19, released at the beginning of the week to fix some IPsec issues, but the result is still the same.

3.7.17 OK
3.7.18 KO
3.7.19 KO

For information, the firewall is an SN700 model.
On an SN300 with 3.7.18 there is no problem; the Stormshield metrics are still there.

That's annoying…

The first step is to compare the sysDescr OID and the sysObjectID OID for both devices:
1.3.6.1.2.1.1.1 and 1.3.6.1.2.1.1.2
I think your check cannot identify the device as a Stormshield and therefore does not try to discover the checks.
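
To make that comparison repeatable, here is a minimal Python sketch that pulls the two identity OIDs out of two stored walks and reports the ones that differ. It assumes the files written by cmk --snmpwalk hold one "OID value" pair per line, separated by a space; the function names are made up for illustration and are not part of Checkmk.

```python
SYS_DESCR = ".1.3.6.1.2.1.1.1.0"
SYS_OBJECT_ID = ".1.3.6.1.2.1.1.2.0"
IDENTITY_OIDS = (SYS_DESCR, SYS_OBJECT_ID)


def read_identity(walk_text):
    """Return {oid: value} for the identity OIDs found in a stored walk."""
    found = {}
    for line in walk_text.splitlines():
        # Assumed line layout: "<OID> <value>", split at the first space.
        oid, _, value = line.partition(" ")
        if oid in IDENTITY_OIDS:
            found[oid] = value.strip()
    return found


def compare_identity(walk_a, walk_b):
    """Return {oid: (value_a, value_b)} for identity OIDs whose values differ."""
    a = read_identity(walk_a)
    b = read_identity(walk_b)
    return {
        oid: (a.get(oid), b.get(oid))
        for oid in IDENTITY_OIDS
        if a.get(oid) != b.get(oid)
    }
```

The two arguments are the text of the walk files, for example the contents of ~/var/check_mk/snmpwalks/MyHost and ~/var/check_mk/snmpwalks/MyHost2. An empty result means both devices identify themselves the same way.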


OK,

Here are the requested OIDs BEFORE the upgrade (when the Stormshield metrics worked) and, below them, AFTER the upgrade, when they stopped working. It seems that what you call the sysObjectID has changed.

.1.3.6.1.2.1.1.1.0 NS-BSD SN700XXXXXXXXXX i386
.1.3.6.1.2.1.1.2.0 .1.3.6.1.4.1.8072.3.2.8

.1.3.6.1.2.1.1.1.0 NS-BSD SN700XXXXXXXXXX i386
.1.3.6.1.2.1.1.2.0 .1.3.6.1.4.1.11256.2.0

Based on that observation, do you think it would be possible to make a copy of the plugin, change the OID, and have it magically work? (Maybe a stupid idea, I just wonder.)

Hi,
when you look at the checks in ~/share/check_mk/checks, the first OID points to a Barracuda OID, the second to Stormshield. You can try to copy the Stormshield checks to ~/local/share/check_mk/checks and change the OID identifier in the copies.

Cheers,
Christian
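
The copy step can also be scripted. This is a minimal sketch, assuming it is run as the site user and that the shipped checks sit under share/check_mk/checks relative to the site root, with local overrides under local/share/check_mk/checks. The function name is hypothetical, and the sketch only copies files; changing the OID in the copies remains a manual step.

```python
import shutil
from pathlib import Path


def copy_stormshield_checks(site_root):
    """Copy every stormshield_* check file into the site's local checks directory.

    Returns the sorted list of file names that were copied.
    """
    src = Path(site_root) / "share/check_mk/checks"
    dst = Path(site_root) / "local/share/check_mk/checks"
    dst.mkdir(parents=True, exist_ok=True)

    copied = []
    for check in sorted(src.glob("stormshield_*")):
        if check.is_file():
            # copy2 keeps timestamps and permissions of the shipped file.
            shutil.copy2(check, dst / check.name)
            copied.append(check.name)
    return copied
```

For the site in this thread the call would be copy_stormshield_checks("/omd/sites/company").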

The OID in the plugin share/check_mk/checks/stormshield_packets,
.1.3.6.1.4.1.11256.1.4.1.1,
seems correct (when I do an snmpwalk on my hosts, it works for every one of them).

The sysObjectID, .1.3.6.1.2.1.1.2.0, is now .1.3.6.1.4.1.11256.2.0. Can that have an effect?

Anyway, I don't know what I can really do.
I tried copying share/check_mk/checks/stormshield_packets to share/check_mk/checks/stormshield_packetsBugSN700 and putting .1.3.6.1.4.1.11256.2.0 in instead, but it didn't change anything.

I also tried adding a "Hosts without system description OID" rule in case it could help, but it didn't.

I ran the following test with the OIDs from the stormshield_info and stormshield_packets checks.

cmk -v --snmpwalk --extraoid .1.3.6.1.4.1.11256.1.0 MyHost
MyHost:
Walk on ".1.3.6.1.2.1"...
31383 variables.
Walk on ".1.3.6.1.4.1"...
17611 variables.
Walk on ".1.3.6.1.4.1.11256.1.0"...
16319 variables.
Wrote fetched data to /omd/sites/company/var/check_mk/snmpwalks/MyHost.

cmk --usewalk --debug  -vII MyHost
Discovering services and host labels on: MyHost
MyHost:
+ FETCHING DATA
[SNMPFetcher] Execute data source
No piggyback files for 'MyHost'. Skip processing.
No piggyback files for '127.0.0.1'. Skip processing.
[PiggybackFetcher] Execute data source
+ PARSE FETCHER RESULTS
Received no piggyback data
+ EXECUTING HOST LABEL DISCOVERY
+ PERFORM HOST LABEL DISCOVERY
+ EXECUTING DISCOVERY PLUGINS (11)
  1 hr_cpu
  6 hr_fs
  4 if64
  1 mem_used
  1 snmp_info
  1 ucd_cpu_load
  4 ucd_diskio
  1 ucd_mem
  1 uptime
SUCCESS - Found 20 services, no host labels



cmk -v --snmpwalk --extraoid .1.3.6.1.4.1.11256.1.4.1.1 MyHost

MyHost:
Walk on ".1.3.6.1.2.1"...
31323 variables.
Walk on ".1.3.6.1.4.1"...
18017 variables.
Walk on ".1.3.6.1.4.1.11256.1.4.1.1"...
1638 variables.
Wrote fetched data to /omd/sites/company/var/check_mk/snmpwalks/MyHost.

cmk --usewalk --debug  -vII MyHost
Discovering services and host labels on: MyHost
MyHost:
+ FETCHING DATA
[SNMPFetcher] Execute data source
No piggyback files for 'MyHost'. Skip processing.
No piggyback files for '127.0.0.1'. Skip processing.
[PiggybackFetcher] Execute data source
+ PARSE FETCHER RESULTS
Received no piggyback data
+ EXECUTING HOST LABEL DISCOVERY
+ PERFORM HOST LABEL DISCOVERY
+ EXECUTING DISCOVERY PLUGINS (11)
  1 hr_cpu
  6 hr_fs
  4 if64
  1 mem_used
  1 snmp_info
  1 ucd_cpu_load
  4 ucd_diskio
  1 ucd_mem
  1 uptime
SUCCESS - Found 20 services, no host labels

So… what is the next step to get monitoring of my Stormshield SN700 back? Any clues?

The important part is not whether snmpwalk works on your device.
The checks look for the following OIDs, and if they cannot be found you get no check.

OID .1.3.6.1.2.1.1.2.0 must start with .1.3.6.1.4.1.8072
and
OID .1.3.6.1.4.1.11256.1.0.1.0 must exist

On your system only the old firmware fulfills both conditions.
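
As an illustration of those two conditions, here is a small Python sketch that evaluates them against the sysObjectID values reported before and after the firmware upgrade. The dict-backed lookup is only a stand-in for the oid callable that Checkmk hands to a scan function, and the value "present" for .1.3.6.1.4.1.11256.1.0.1.0 is a placeholder, since its real content was not posted.

```python
def make_oid_lookup(values):
    """Stand-in for Checkmk's oid() callable: known OIDs give a string, others None."""
    return values.get


def stock_stormshield_scan(oid):
    # Both conditions must hold for the device to be detected.
    return bool(
        oid(".1.3.6.1.2.1.1.2.0").startswith(".1.3.6.1.4.1.8072")
        and oid(".1.3.6.1.4.1.11256.1.0.1.0")
    )


old_firmware = make_oid_lookup({
    ".1.3.6.1.2.1.1.2.0": ".1.3.6.1.4.1.8072.3.2.8",
    ".1.3.6.1.4.1.11256.1.0.1.0": "present",  # placeholder value
})
new_firmware = make_oid_lookup({
    ".1.3.6.1.2.1.1.2.0": ".1.3.6.1.4.1.11256.2.0",
    ".1.3.6.1.4.1.11256.1.0.1.0": "present",  # placeholder value
})

print(stock_stormshield_scan(old_firmware))  # True
print(stock_stormshield_scan(new_firmware))  # False
```

The second call fails on the first condition alone, which matches the behaviour seen after the upgrade: the Stormshield OIDs are still in the walk, but the device is never recognised, so no Stormshield plugin is run.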

A resolution for your problem could be the following:

  • Copy the existing stormshield checks to “~/local/share/check_mk/checks/”
  • Modify the scan function so that it uses your new OIDs
  • Try to discover the services

The problem here is that the enterprise OID 8072.3.2.8 is a generic OID for a FreeBSD operating system.

Without modifying the checks you will get no checks; that is my conclusion.

OK Andreas, thank you very much for pointing me in the right direction.

I still have 2 questions about the plan:

Before cmk 2.0 the scan function was in a stormshield.define file.
Now I found the file
/opt/omd/versions/2.0.0p1.cee/lib/python3/cmk/base/check_legacy_includes/stormshield.py

Must I also copy this file to ~/local/share/check_mk/checks/? Or is there another local folder for this kind of file?

My second question: once this is done and it works with the SN700s and their new firmware, what will happen to my other Stormshield hosts? I have a dozen Stormshields, and of course I don't want to mess up the others.

Thanks again, Christian and Andreas, for helping me with this problem.

I tried:

  • copying the stormshield checks from /opt/omd/versions/2.0.0p1.cee/share/check_mk/checks/stormshield_* to
    ~/local/share/check_mk/checks/

  • also creating the ~/local/lib/python3/cmk/base/check_legacy_includes/ directory
    and copying the stormshield scan functions from ~/lib/python3/cmk/base/check_legacy_includes/stormshield.py to ~/local/lib/python3/cmk/base/check_legacy_includes/

  • modifying the scan function to use the new OIDs.

The discovery didn't find the new services (neither in the web GUI nor with cmk --debug -vII MyHost).

Without looking at the code, I would recommend including the scan function in your modified check.

If I have some time I can have a look at the code later today, but this is no promise :wink:

Andreas

As requested, here is the Stormshield scan function file I modified:

cat ~/local/lib/python3/cmk/base/check_legacy_includes/stormshield.py
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
# Copyright (C) 2019 tribe29 GmbH - License: GNU General Public License v2
# This file is part of Checkmk (https://checkmk.com). It is subject to the terms and
# conditions defined in the file COPYING, which is part of this source code package.

# type: ignore[list-item,import,assignment,misc,operator]  # TODO: see which are needed in this file


def stormshield_scan_function(oid):
    return (oid(".1.3.6.1.2.1.1.2.0").startswith('.1.3.6.1.4.1.11256') and
            oid('.1.3.6.1.4.1.11256.1.0.1.0'))


def stormshield_cluster_scan_function(oid):
    # We have to use a different scan function here, so we only try getting
    # our snmp_info if information about the cluster exists
    return (oid(".1.3.6.1.2.1.1.2.0").startswith('.1.3.6.1.4.1.11256.2.0') and
            oid('.1.3.6.1.4.1.11256.1.11.*'))

Also, Stormshield support answered my request.
They confirmed that they modified the SNMP object sysObjectID so that the SNMP provider it reports is Stormshield instead of Net-SNMP.
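
Given that confirmation, one option for a mixed fleet is a scan function that accepts either sysObjectID prefix, so that devices on old and new firmware are both still detected. This is only a sketch: it has not been tested on a Checkmk site, and Checkmk 2.0 may not accept every Python construct in a legacy scan function, so treat the exact form as an assumption. The lookup helper at the end is a test stand-in and not part of Checkmk.

```python
def stormshield_scan_function(oid):
    # Accept the old Net-SNMP sysObjectID prefix as well as the new
    # Stormshield one, and still require the Stormshield OID to exist.
    return ((oid(".1.3.6.1.2.1.1.2.0").startswith(".1.3.6.1.4.1.8072") or
             oid(".1.3.6.1.2.1.1.2.0").startswith(".1.3.6.1.4.1.11256")) and
            oid(".1.3.6.1.4.1.11256.1.0.1.0"))


def lookup(values):
    """Test stand-in for the oid() callable: unknown OIDs give None."""
    return values.get
```

With this form a host is matched when the Stormshield OID .1.3.6.1.4.1.11256.1.0.1.0 is present and the sysObjectID starts with either prefix, which should leave Stormshield hosts that still run older firmware unaffected.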
