Notifications in multisite / Spool directory full of files

Hello community

I have a master and 5 slaves configured with replication, all instances running 2.0.0p1CEE.
I want all notifications (mail/SMS) to be sent from the master.

On my main site, notification spooling is set to “Asynchronous local delivery by notification spooler”,
and I made a site-specific configuration for the slave instances, for which I set
Notification spooling > Forward to remote site by notification spooler

Currently:

  • I do not receive mail or SMS notifications from the slaves.
  • When I check on a slave, I can see a directory with many files that seem to be notifications waiting to be processed,
    for example: /omd/sites/slaveinstance/var/check_mk/notify/spool/46b36255-1d6f-4725-8a34-5e4e3d3f51dd
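To put a number on the backlog, a minimal Python sketch can count the files in that spool directory and report the age of the oldest one. The function name `spool_summary` is hypothetical, and the path is simply the one from the example above; adjust the site name to your own.

```python
import time
from pathlib import Path


def spool_summary(spool_dir, now=None):
    """Return (file_count, oldest_age_seconds) for regular files in spool_dir."""
    now = time.time() if now is None else now
    mtimes = [p.stat().st_mtime for p in Path(spool_dir).iterdir() if p.is_file()]
    if not mtimes:
        return 0, 0.0
    return len(mtimes), now - min(mtimes)


if __name__ == "__main__":
    # Path taken from the example in the post; the site name is site-specific.
    spool = "/omd/sites/slaveinstance/var/check_mk/notify/spool"
    if Path(spool).is_dir():
        count, age = spool_summary(spool)
        print(f"{count} spooled file(s), oldest is {age:.0f} s old")
    else:
        print(f"{spool} not found on this host")
```

A count that keeps growing while the oldest file gets older would suggest the spooler is collecting notifications but not forwarding them.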

I guess something is missing in my configuration to achieve this… could you point out what I missed?

Thank you for your help
Regards

Hi @LAFONTA91 ,

First, I would edit the settings of every slave site in the distributed monitoring and set it to “allow incoming connection”.
After you have configured that for every slave, you can edit the global settings of the master site and add a connection for every slave (connect to remote sites).

Hi,
You also need to configure the notification daemon as a listener in the global configuration:
under “Notification spooler configuration”, switch on “Accept incoming TCP connections” on the master and “Connect to remote side” on the slaves.
Cheers,
Christian

You can check the state of the notification spooler on the master and slaves in the following log/statefile:

~/site/var/log/mknotifyd.state
~/site/var/log/mknotifyd.log

Great information, thank you!

Because of a network constraint, I had to enable “Accept incoming TCP connections” on my slaves and create the “Connect to remote side” entries on my master. I suppose this is not a problem, as long as the master is not both accepting and initiating connections at the same time?

This is what I see in mknotifyd.state on my master (I am running watch every 2 s):

State:                     established
Status Message:           Successfully connected to slave:6555

then

State:                    cooldown
No data from slave:6555 for 13 seconds! Declaring as dead.

This repeats in a loop,
and no notifications are coming in…
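The flapping can also be followed from a script rather than by eye. The sketch below assumes only that the state file contains `State:` lines like the excerpts shown here; the rest of the file format is not known from this thread, and `connection_states` is a hypothetical helper name.

```python
import re


def connection_states(text):
    """Return the value of every 'State:' line in the text, in order."""
    # Anchored at line start so 'Status Message:' lines are not matched.
    return re.findall(r"^\s*State:\s*(\S+)", text, flags=re.MULTILINE)


if __name__ == "__main__":
    sample = (
        "State:                     established\n"
        "Status Message:           Successfully connected to slave:6555\n"
    )
    print(connection_states(sample))
```

Calling it repeatedly on the file contents and logging each change from `established` to `cooldown` gives a record of how often the connection drops.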

I set the log verbosity to debugging, but I see no log on the slave in ~omd/var/log/*

The log on the master only contains a loop of:

2021-04-09 16:27:00,191 [40] [cmk.mknotifyd] No data from slave:6555 for 13 seconds! Declaring as dead.
2021-04-09 16:27:20,197 [20] [cmk.mknotifyd] Connection to slave port 6555 in progress
2021-04-09 16:27:20,199 [20] [cmk.mknotifyd] Successfully connected to slave:6555
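To measure how long each outage lasts, the log lines can be parsed with a short Python sketch. It assumes every line has the layout of the three lines above (timestamp, bracketed number, bracketed logger name, message); `parse_line` and `reconnect_delays` are hypothetical helper names, not part of Checkmk.

```python
import re
from datetime import datetime

# Layout assumed from the log excerpt: "<timestamp> [<level>] [<logger>] <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"\[(?P<level>\d+)\] \[(?P<logger>[^\]]+)\] (?P<msg>.*)$"
)


def parse_line(line):
    """Return (timestamp, level, message), or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    ts = datetime.strptime(m["ts"], "%Y-%m-%d %H:%M:%S,%f")
    return ts, int(m["level"]), m["msg"]


def reconnect_delays(lines):
    """Seconds between each 'Declaring as dead' and the next successful connect."""
    delays, dead_at = [], None
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        ts, _level, msg = parsed
        if "Declaring as dead" in msg:
            dead_at = ts
        elif msg.startswith("Successfully connected") and dead_at is not None:
            delays.append((ts - dead_at).total_seconds())
            dead_at = None
    return delays
```

Feeding it the lines of mknotifyd.log shows whether the cycle is regular, which would point to a timeout on an otherwise idle connection rather than a random network fault.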

Where can I look next?

@ChristianM @aeckstein I think I have pinpointed the problem.

It seems the connection cannot be established:
when I run tcpdump I see only [SYN] packets to port 6555, without any response.
On the server, I ran netstat to check for a listener. Something is listening, but there is also something that seems to be eating the connections…

tcp   LISTEN    11     10                0.0.0.0:6555             0.0.0.0:*      users:(("python3",pid=17735,fd=5))
tcp   ESTAB     0      2504929         127.0.0.1:36032          127.0.1.1:6555   users:(("python3",pid=17735,fd=6))
tcp   ESTAB     81610  0               127.0.1.1:6555           127.0.0.1:36032  users:(("python3",pid=17735,fd=7))
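One possible reading of the LISTEN line, offered as an assumption rather than a confirmed diagnosis: for listening sockets, the Recv-Q column shows the number of connections waiting to be accepted and Send-Q shows the backlog limit, so 11 against 10 would mean the accept queue is full and the process is no longer accepting connections. That would fit the unanswered SYN packets. A minimal Python sketch for spotting this in such output follows; `parse_ss_line` and `full_listen_queues` are hypothetical helper names.

```python
def parse_ss_line(line):
    """Split one socket line into its first six columns; None if it does not fit."""
    fields = line.split()
    if len(fields) < 6:
        return None
    try:
        recv_q, send_q = int(fields[2]), int(fields[3])
    except ValueError:
        return None  # header line or unexpected layout
    return {
        "proto": fields[0],
        "state": fields[1],
        "recv_q": recv_q,
        "send_q": send_q,
        "local": fields[4],
        "peer": fields[5],
    }


def full_listen_queues(lines):
    """Return listening sockets whose accept queue is at or over the backlog limit."""
    hits = []
    for line in lines:
        sock = parse_ss_line(line)
        if sock and sock["state"] == "LISTEN" and sock["recv_q"] >= sock["send_q"]:
            hits.append(sock)
    return hits
```

Applied to the three lines above, only the listener on 0.0.0.0:6555 is reported.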

As you can see, all three belong to the same process (python3 /omd/sites/slave/bin/mknotifyd).

I checked the slave configuration again and there is just “accept TCP connection”…
So why does it connect to itself? Does that prevent others from connecting?

On the slave host itself I cannot connect to port 6555 (telnet localhost 6555 returns: Unable to connect to remote host: Connection timed out).
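The same check can be scripted so that it tells a timeout apart from a refused connection. This is a minimal sketch using only the Python standard library; `probe_tcp` is a hypothetical helper name, and the 3-second timeout is an arbitrary choice.

```python
import socket


def probe_tcp(host, port, timeout=3.0):
    """Try a TCP connect and classify the result as open, refused, timeout or error."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "refused"  # nothing is listening on that port
    except socket.timeout:
        return "timeout"  # no answer at all, e.g. dropped SYN packets
    except OSError as exc:
        return f"error: {exc}"


if __name__ == "__main__":
    # Port 6555 is the one used by the spooler in this thread.
    print(probe_tcp("localhost", 6555))
```

A result of "timeout" on localhost, where no firewall should interfere, points at the listening process itself rather than at the network.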

I reloaded OMD and now it is working.
However, I also had to edit the configuration on the slave manually, because the local configuration did not match what I had set in the site-specific settings on the master…
Anyway, thank you for your answers; they helped me understand how to fix my problem.
