Custom check not getting executed - permanently staying pending

Hello Everyone,

Used Version: check_mk_raw 1.5.0p21
OS: Debian 10 Buster
I am having some problems getting my own check to work.
I used the following sites to write my check
(it is server-side only, as the check_mk_agent already delivers all the information I need):
https://checkmk.de/cms_legacy_devel_agentbased.html
https://www.steinkogler.org/2016/08/21/check-mk-write-your-own-check/

The inventory function works fine, as I can add the service via the website:

The check function also works fine, but only in the Terminal with ‘cmk -nv testhost’:

Check_MK version 1.5.0p21
+ FETCHING DATA
 [agent] Execute data source
 [piggyback] Execute data source
CPU utilization      OK - 0.0% used, user perc: 3.7 %, privileged perc: 3.5 %, 12 CPUs
Disk IO SUMMARY      OK - Read: 0.00 B/s, Write: 0.00 B/s, Average Read Wait: 0.00 ms, Average Write Wait: 0.00 ms, Average Read Queue Length: 0.00, Average Write Queue Length: 0.00
Filesystem C:/       OK - 25.1% used (24.41 of 97.43 GB), trend: +7.93 MB / 24 hours
Filesystem D:/       OK - 0.314% used (2.62 of 832.73 GB), trend: 0.00 B / 24 hours
Filesystem E:/       OK - 89.9% used (16.35 of 18.18 TB), trend: -14.15 GB / 24 hours
Filesystem F:/       OK - 0.0% used (0.00 B of 1.97 MB), trend: 0.00 B / 24 hours
Interface 1          OK - [Intel[R] I350 Gigabit Network Connection] (Connected) 1 Gbit/s, in: 23.53 MB/s(19.7%), out: 902.37 kB/s(0.7%)
Memory and pagefile  OK - Memory usage: 62.6% (4.99 GB/7.97 GB), Commit charge: 14.7% (2.35 GB/15.95 GB)
Posa Processes       OK - 23 - All Processes running 15265.1 MB virtual 2259.8 MB physical
Processor Queue      OK - 15 min load 0.00 at 12 Cores (0.00 per Core)
Services Summary     OK - 128 services, 49 services in autostart - of which 4 services are stopped (Emulex_HBA_Management, hcmagent, sppsvc, upnphost), 0 services stopped but ignored
System Time          OK - Offset is 0 sec (warn/crit at 30/60 sec)
Uptime               OK - Up since Wed Feb 27 13:37:15 2019 (358d 18:37:25)
OK - [agent] Version: 1.5.0p21, OS: windows, execution time 0.2 sec | execution_time=0.210 user_time=0.030 system_time=0.000 children_user_time=0.000 children_system_time=0.000 cmk_time_agent=0.178

For some reason the check never returns any output in the webui and always stays pending:
PEND [Posa Processes]
There should have been a screenshot here but I can only upload one, as I am new here.

Now I think there is a problem with check_mk passing the plugin output to Nagios, as Nagios doesn’t receive any plugin output:

# nagios.log
[1582239600] CURRENT SERVICE STATE: testhost;Check_MK;OK;HARD;1;OK - [agent] Version: 1.5.0p21, OS: windows, execution time 0.4 sec
[1582239600] CURRENT SERVICE STATE: testhost;Check_MK Discovery;UNKNOWN;HARD;1;(null)
[1582239600] CURRENT SERVICE STATE: testhost;Disk IO SUMMARY;OK;HARD;1;OK - Read: 0.00 B/s, Write: 0.00 B/s, Average Read Wait: 0.00 ms, Average Write Wait: 0.00 ms, Average Read Queue Length: 0.00, Average Write Queue Length: 0.00
[1582239600] CURRENT SERVICE STATE: testhost;Filesystem C:/;OK;HARD;1;OK - 29.6% used (23.09 of 78.12 GB), trend: +1.20 MB / 24 hours
[1582239600] CURRENT SERVICE STATE: testhost;Filesystem D:/;OK;HARD;1;OK - 1.18% used (4.56 of 386.50 GB), trend: 0.00 B / 24 hours
[1582239600] CURRENT SERVICE STATE: testhost;Filesystem E:/;OK;HARD;1;OK - 0.0% used (0.00 B of 1.97 MB), trend: 0.00 B / 24 hours
[1582239600] CURRENT SERVICE STATE: testhost;Interface 1;OK;HARD;1;OK - [Intel[R] 82574L Gigabit Network Connection] (Connected) 1 Gbit/s, in: 2.85 MB/s(2.4%), out: 1.99 kB/s(0.0%)
[1582239600] CURRENT SERVICE STATE: testhost;Memory and pagefile;OK;HARD;1;OK - Memory usage: 31.7% (2.54 GB/7.99 GB), Commit charge: 16.6% (2.65 GB/15.98 GB)
[1582239600] CURRENT SERVICE STATE: testhost;Posa Processes;OK;HARD;1;
[1582239600] CURRENT SERVICE STATE: testhost;Processor Queue;OK;HARD;1;OK - 15 min load 0.08 at 8 Cores (0.01 per Core)
[1582239600] CURRENT SERVICE STATE: testhost;Services Summary;OK;HARD;1;OK - 127 services, 45 services in autostart - of which 2 services are stopped (sppsvc, upnphost), 0 services stopped but ignored
[1582239600] CURRENT SERVICE STATE: testhost;System Time;OK;HARD;1;OK - Offset is 0 sec (warn/crit at 30/60 sec)
[1582239600] CURRENT SERVICE STATE: testhost;Uptime;OK;HARD;1;OK - Up since Tue Oct 30 14:30:43 2018 (478d 09:28:28)

Here is the check declaration:

# declare the check to Checkmk
check_info["ps.posa"] = {
    'check_function'          :  check_ps_posa,
    'inventory_function'      :  inventory_ps_posa,
    'service_description'     : 'Posa',
    "node_info"               : True, # add first column with actual host name
    "has_perfdata"            : True,
    "group"                   : "posa",
}

And here is the raw output my check function returns to cmk:
(0, '15 - All Processes running 8814.3 MB virtual 1132.3 MB physical', [('vsz', 9025796, '', ''), ('rss', 1159472, '', '')])

I can only guess that my format for my check output is somewhat wrong.
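
For reference, legacy check functions conventionally return a tuple of (state, infotext) with an optional third element holding a list of performance data tuples. The following is only a minimal sketch: `describe_result` is a hypothetical helper, not part of Checkmk, and the rendered line is a simplified Nagios-style output that leaves out warn/crit levels. It only shows how the tuple above maps onto that convention.

```python
# Hypothetical helper (not a Checkmk function): checks the conventional
# (state, infotext[, perfdata]) shape and renders a simplified status line.
def describe_result(result):
    state, infotext = result[0], result[1]
    perfdata = result[2] if len(result) > 2 else []
    if state not in (0, 1, 2, 3):
        raise ValueError("state must be 0 (OK), 1 (WARN), 2 (CRIT) or 3 (UNKNOWN)")
    state_name = ("OK", "WARN", "CRIT", "UNKNOWN")[state]
    # Only metric name and value are rendered; levels are ignored here.
    metrics = " ".join("%s=%s" % (entry[0], entry[1]) for entry in perfdata)
    line = "%s - %s" % (state_name, infotext)
    return line + (" | " + metrics if metrics else "")


# The tuple quoted in the post above.
result = (0, '15 - All Processes running 8814.3 MB virtual 1132.3 MB physical',
          [('vsz', 9025796, '', ''), ('rss', 1159472, '', '')])
print(describe_result(result))
```

Under these assumptions the tuple has the expected overall shape, so the format alone may not be what keeps the service pending.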

Any help would be great. :slight_smile:
Thanks in advance.

First of all, why do you write your own check for this? Check_MK is able to get this data with “Process discovery” and “State and count of process” or “Manual Check”.
From my point of view there is no need to write your own check for this.

Hello ChristianM,

I am well aware of the “State and count of process” check,
but it does not fulfill all of my requirements.
With all the different combinations of processes on my hosts,
it would be a lot of work to check them with “State and count of process”.

That’s the reason why a custom check is much simpler and easier to maintain.

Hello Everyone,

does anyone else have an idea why the plugin output is not reaching Nagios?
If needed I can of course provide log files, but I am currently at a loss as to where to search for error messages or any lead to a solution.

On a side note:
If I change the group in the check declaration to "group" : "ps", which is the same group the default checks ps and ps.perf use, I get the following error:

Checkgroup ps has checks with and without item! At least one of the checks in this group needs to be changed (With item: ps, ps.perf, Without item: ps.posa)

Does anyone know what this ‘item’ is and what check_mk expects my plugin to return?
I remember reading about it somewhere, but the article was incomplete and I currently can’t find it anymore.

The ps check is normally designed to return items, which means you can have more than one service checked by the same check, like ps_1, ps_2, ps_3, and so on.

I don’t know how your discovery (formerly inventory) function and check function are designed, but judging by the error, your check is not able to handle more than one item.

I would recommend not using ps.posa as the check name and choosing a completely custom name instead. That may help you avoid having to design your check to match all functions of ps and ps.perf.

If something goes wrong inside the check function, your service could stay pending instead of throwing an exception.

Maybe this explanation helps you in designing your own check.
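
The item concept described above can be sketched roughly as follows. This is only an illustration under assumptions: the check name `example_procs`, the two functions, and the layout of `info` are made up, and `check_info` is defined locally as a stand-in for the global dict that Checkmk provides to check plugins. In the legacy API, the inventory function returns one (item, parameters) pair per service, the check function receives the item as its first argument, and a `%s` in the service description is replaced by the item.

```python
# Stand-in for the dict Checkmk provides to plugins; defined here only so
# that the sketch runs on its own.
check_info = {}


def inventory_example(info):
    # Yield one (item, default_params) pair per distinct process name.
    seen = []
    for line in info:
        name = line[0]
        if name not in seen:
            seen.append(name)
            yield name, None


def check_example(item, params, info):
    # The item tells the function which of the discovered services to check.
    count = sum(1 for line in info if line[0] == item)
    if count == 0:
        return 2, "no process named %s found" % item
    return 0, "%d processes named %s" % (count, item), [("count", count)]


check_info["example_procs"] = {
    "check_function": check_example,
    "inventory_function": inventory_example,
    "service_description": "Example %s",  # %s is replaced by the item
    "has_perfdata": True,
}

# Hypothetical, already split agent data: one process name per line.
info = [["csrss.exe"], ["csrss.exe"], ["smss.exe"]]
for item, _params in inventory_example(info):
    print(item, check_example(item, None, info))
```

A check without items would instead use a service description without `%s` and return `None` as the item, which is why a check group cannot mix both kinds.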

Hello tosch,

thank you for that explanation of the item. It makes a lot more sense now. A simple change of the service description to 'service_description' : 'Posa %s', fixed the item error.

On the change of the check name, that seems not possible as I need the agent output:

<<<ps:sep(9)>>>
(SYSTEM,0,24,0,0,0,0,176416674744871,-4294967296,-4294967284,1582711495)        System Idle Process
(SYSTEM,4404,1332,0,4,0,0,2537117427469,1802,233,31440646)      System
(\\NT-AUTORITÄT\SYSTEM,5560,1608,0,612,0,156001,483291098,43,3,31440646)        smss.exe
(\\NT-AUTORITÄT\SYSTEM,51724,5628,0,704,3,304981955,830237322,1293,10,31440593) csrss.exe
(\\NT-AUTORITÄT\SYSTEM,49340,5364,0,764,2,0,312002,93,3,31440590)       wininit.exe

And cmk only gives me this data as the ‘info’ variable if my check is called ‘ps.name’, so unless there is another way to get this data, my check needs to be called ps.posa.
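
As a rough illustration of what arrives in ‘info’ for this section: `sep(9)` means the fields of each line are separated by a TAB character (ASCII 9), so every line consists of the parenthesized attribute list and the process name. The sketch below uses a hypothetical helper `parse_ps_line`. The meaning of the individual attributes (user, virtual size in kB, resident size in kB, process ID at index 4) is an assumption based on the sample output, not a documented format.

```python
def parse_ps_line(raw_line):
    # Checkmk performs the TAB split itself before calling the check; it is
    # repeated here only so that the sketch runs standalone.
    attrs_raw, name = raw_line.split("\t", 1)
    attrs = attrs_raw.strip().strip("()").split(",")
    return {
        "user": attrs[0],
        "vsz_kb": int(attrs[1]),  # assumed: virtual size in kB
        "rss_kb": int(attrs[2]),  # assumed: resident size in kB
        "pid": int(attrs[4]),     # assumed: process ID
        "name": name.strip(),
    }


# One line from the agent output above, with the TAB written explicitly.
sample = "(SYSTEM,4404,1332,0,4,0,0,2537117427469,1802,233,31440646)\tSystem"
print(parse_ps_line(sample))
```

Splitting on commas assumes that the user field itself contains none, which holds for the sample lines but is not guaranteed.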

It seems I have to understand the ps and ps.perf checks to find out what is still missing.

Thank you so far.

Hi,

does anyone have a solution or explanation for this issue? I also tried to write a custom plugin for checking directory sizes with an inventory function. The services are found during service discovery, but after activating changes, the checks are always pending.

I found the solution. After activating changes, I logged in as the site user on the server and ran cmk -R. I don’t know why, but after that the custom plugin started to work. 1.6.0p11 Enterprise.