Hi Checkmk users,
It seems I'm running into an issue similar to what others have reported on 2.1.0p26: service discovery through the REST API does not reliably move services into the monitored state.
Related thread: "Service Discovery via API not working as expected"
We have been using the older Web API for the last couple of years with no issues to automate adding our new servers.
This week I have been working on a rewrite for the new REST API, as we recently upgraded from 2.0.0p21 to 2.1.0p26.
Our new REST API script for adding a server basically works, except that the discovery step does not reliably move the services into the monitored state.
I'm using a bash script with curl, which is simple enough, and so far discovery is the only part giving me trouble.
I am familiar with the process and, by now, with basic interaction with the new REST API.
I am following the same workflow as before, and I have reviewed the built-in and online API docs and examples:
create_host
discover_services
activate_changes
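The three steps above can be sketched with bash and curl roughly as follows. This is a minimal sketch, assuming the 2.1 REST API endpoint paths and Bearer authentication with an automation user. The server, site, user, secret, and function names are placeholders for illustration rather than the actual script, and the paths should be checked against the API docs built into your own site.

```shell
#!/usr/bin/env bash
# Sketch of the create -> discover -> activate workflow.
# All values below are placeholders (hypothetical server, site, user, secret).
CMK_SERVER="${CMK_SERVER:-monitoring.example.com}"
CMK_SITE="${CMK_SITE:-mysite}"
CMK_USER="${CMK_USER:-automation}"
CMK_SECRET="${CMK_SECRET:-changeme}"
API_URL="https://${CMK_SERVER}/${CMK_SITE}/check_mk/api/1.0"

# POST a JSON body to an API path and print only the HTTP status code.
cmk_post() {
    local path="$1" body="$2"
    curl --silent --output /dev/null --write-out '%{http_code}\n' \
        --request POST \
        --header "Authorization: Bearer ${CMK_USER} ${CMK_SECRET}" \
        --header "Accept: application/json" \
        --header "Content-Type: application/json" \
        --data "${body}" \
        "${API_URL}${path}"
}

# $1 = host name, $2 = folder (for example "/")
create_host() {
    cmk_post "/domain-types/host_config/collections/all" \
        "{\"host_name\": \"$1\", \"folder\": \"$2\"}"
}

# $1 = host name, $2 = discovery mode (for example new, fix_all, refresh)
# Endpoint path assumed from the 2.1 API docs; newer versions may differ.
discover_services() {
    cmk_post "/objects/host/$1/actions/discover_services/invoke" \
        "{\"mode\": \"$2\"}"
}

# Activate pending changes; the body fields are assumptions from the 2.1 docs.
activate_changes() {
    cmk_post "/domain-types/activation_run/actions/activate-changes/invoke" \
        '{"redirect": false, "force_foreign_changes": false}'
}
```

A run would then be `create_host newserver /`, `discover_services newserver new`, `activate_changes`, each printing the HTTP status code it received. Printing the response body as well (dropping `--output /dev/null`) may help when a call returns 200 but the result is not what was expected.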
I've tried the refresh, new, and fix_all modes separately and, as I mention below, back to back for the discovery.
I have tested putting a 1-2 minute sleep right after the discovery call in my script so that the background discovery for the new host has definitely completed.
I then activate, and all the services end up unmonitored.
I have even scripted discovery and activation to run twice back to back (first with 'new', then with 'fix_all'), with a delay between discovery and activation each time.
The Web API was pretty reliable for this compared to how the REST API is behaving.
Here is an example of a script run where the host does not yet exist in Checkmk at all and I try discover/activate twice (even though twice should not be required):
200 OK - host created
200 OK - host discovery
Waiting 60 seconds for checkmk background discovery process to complete (mode=new)
200 OK - activate changes
200 OK - host discovery
Waiting 60 seconds for checkmk background discovery process to complete (mode=fix_all)
200 OK - activate changes
I noticed that if I run the entire script a second time, it errors on the host already existing but continues, runs discovery again, and this time succeeds in moving the unmonitored services to monitored. From that result, it seems some longer delay is needed?
We are not in any rush, but we want to switch to the REST API for our server adds so we are not stuck on 2.1, since I think 2.2 removes the old Web API.
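If timing really is the cause, one option is to replace the fixed sleep with a retry loop that repeats a check until it passes. The helper below is only a generic sketch: `retry_until` is a made-up name, and the check command passed to it (for example a function that asks Checkmk whether the host still has unmonitored services) is hypothetical and would have to be written against whatever endpoint your version provides.

```shell
# retry_until MAX_TRIES DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it exits 0, waiting DELAY_SECONDS between attempts.
# Returns 0 on the first success, 1 if every attempt failed.
retry_until() {
    local max_tries="$1" delay="$2" attempt=1
    shift 2
    while [ "$attempt" -le "$max_tries" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt ${attempt}/${max_tries}: not ready yet" >&2
        attempt=$((attempt + 1))
        # Only sleep if another attempt will follow.
        if [ "$attempt" -le "$max_tries" ]; then
            sleep "$delay"
        fi
    done
    return 1
}
```

For example, `retry_until 5 30 services_are_monitored newserver` would try up to five times, 30 seconds apart, where `services_are_monitored` is the hypothetical check function. This only helps if the problem is a delay; it does not explain why a single discovery plus activation is not enough.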
Might anyone have any advice for this issue?
CMK version:
2.1.0p26
OS version:
AlmaLinux 8
4.18.0-425.19.2.el8_7.x86_64
Error message:
43 unmonitored services (postfix_mailq_status:1, local:13, nfsmounts:2, diskstat:1, md:2, ipmi:1, checkmk_agent:1, lnx_thermal:2, logwatch:3, cpu_threads:1, mem_linux:1, cpu_loads:1, ps:1, tcp_conn_stats:1, uptime:1, kernel_performance:1, systemd_units_services_summary:1, postfix_mailq:1, df:2, chrony:1, kernel_util:1, lnx_if:2, mounts:2) WARN, no vanished services found, 1 new host labels WARN