Email notification doesn't work

This is the worrying part: it seems there is a sanity check before a notification is sent out, and because that check fails, no notifications are sent.

In my case, as I’m solo and usually have the dashboard open, I do not really have complicated notification rules.

But my method of troubleshooting is always ‘return to basics, and test’:

Can you try disabling all existing notification rules for now and creating a new, very basic email notification rule? In that rule:

  • set the entry under ‘the following users’ to your own user (assuming it has an email address configured)
  • restrict Matched host to a single host in your monitoring (I used a development box)

Save the rule, then go to the services list of that host and find the option for a fake check result under Commands.
If you do not see the option, click the circle with the three dots; it expands the menu and makes the entry visible.

Clicking the option opens a form. Do not fill in anything, just click the ‘Critical’ button.

You will then get a warning. Just confirm it to trigger the fake status.

Then look at the debug log and check whether the notification viability test now passes.

If it does, you should also see mentions of a notification being sent, followed by log entries in Postfix’s /var/log/maillog.

  • Glowsome

Hello Glowsome,

Thank you very much for the detailed description.
I’ve done exactly that now.
Unfortunately, the log files
/var/log/maillog
/omd/sites/$MySite$/var/log/notify.log
are still empty.
However, there are now new messages in
/omd/sites/$MySite$/var/nagios/debug.log
that may be helpful and point to the error:

[1695887681.627628] [032.0] [pid=108922] ** Service Notification Attempt ** Host: '$HOSTNAME$', Service: 'Postfix status', Type: 0, Options: 0, Current State: 0, Last Notification: Thu Jan  1 01:00:00 1970
[1695887681.627633] [001.0] [pid=108922] check_service_notification_viability()
[1695887681.627638] [001.0] [pid=108922] check_time_against_period()
[1695887681.627646] [001.0] [pid=108922] check_service_dependencies()
[1695887681.627654] [001.0] [pid=108922] check_host_dependencies()
[1695887681.627659] [032.0] [pid=108922] Notification viability test passed.
[1695887681.627663] [001.0] [pid=108922] create_notification_list_from_service()
[1695887681.627667] [001.0] [pid=108922] should_service_notification_be_escalated()
[1695887681.627672] [001.0] [pid=108922] check_contact_service_notification_viability()
[1695887681.627676] [001.0] [pid=108922] check_time_against_period()
[1695887681.627682] [001.0] [pid=108922] add_notification() start
[1695887681.627686] [001.0] [pid=108922] find_notification() start
[1695887681.627689] [001.0] [pid=108922] check_contact_service_notification_viability()
[1695887681.627693] [001.0] [pid=108922] check_contact_service_notification_viability()
[1695887681.627702] [001.0] [pid=108922] notify_contact_of_service()
[1695887681.627713] [001.0] [pid=108922] get_raw_command_line_r()
[1695887681.627725] [001.0] [pid=108922] process_macros_r()
[1695887681.627857] [2048.0] [pid=108922]  WARNING: An error occurred processing macro '_HOSTEC_SL'!
[1695887681.627870] [2048.0] [pid=108922]  WARNING: An error occurred processing macro '_SERVICEEC_SL'!
[1695887681.627881] [2048.0] [pid=108922]  WARNING: An error occurred processing macro '_SERVICEEC_SL'!
[1695887681.627896] [2048.0] [pid=108922]  WARNING: An error occurred processing macro '_HOSTEC_CONTACT'!
[1695887681.627911] [2048.0] [pid=108922]  WARNING: An error occurred processing macro '_SERVICEEC_CONTACT'!
[1695887681.627957] [001.0] [pid=108922] process_macros_r()
[1695887681.628006] [001.0] [pid=108922] my_system_r()

best regards
Sebastian

Well, we did get a bit further, as we now see “Notification viability test passed.” in the log.
There is still a bump in the road, though, as errors are now triggered further on:

[1695887681.627857] [2048.0] [pid=108922]  WARNING: An error occurred processing macro '_HOSTEC_SL'!
[1695887681.627870] [2048.0] [pid=108922]  WARNING: An error occurred processing macro '_SERVICEEC_SL'!
[1695887681.627881] [2048.0] [pid=108922]  WARNING: An error occurred processing macro '_SERVICEEC_SL'!
[1695887681.627896] [2048.0] [pid=108922]  WARNING: An error occurred processing macro '_HOSTEC_CONTACT'!
[1695887681.627911] [2048.0] [pid=108922]  WARNING: An error occurred processing macro '_SERVICEEC_CONTACT'!

I searched for this type of error, and the only hit leads to a post from way back (2014), so even though the error matches, I doubt the cause is the same.
So I am at a bit of a dead end.
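
To pull only these warnings out of a longer debug.log, a small filter along the following lines might help. It is just a sketch: the regular expressions are an assumption based on the line format in the excerpt above, and the function name `failed_macros` is made up for illustration.

```python
import re

# Assumed line layout, inferred from the debug.log excerpt in this thread:
# [epoch.micros] [level] [pid=N] message
LINE_RE = re.compile(
    r"^\[(?P<ts>\d+\.\d+)\] \[(?P<level>[\d.]+)\] \[pid=(?P<pid>\d+)\]\s*(?P<msg>.*)$"
)
MACRO_RE = re.compile(r"An error occurred processing macro '(?P<macro>[^']+)'")


def failed_macros(lines):
    """Return the distinct macro names reported as failing, in first-seen order."""
    seen = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip anything that does not look like a debug.log line
        hit = MACRO_RE.search(match.group("msg"))
        if hit and hit.group("macro") not in seen:
            seen.append(hit.group("macro"))
    return seen
```

Feeding it the lines of the debug.log (for example via `open(path).readlines()`) gives a short list of the affected macros instead of the full trace.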

If it were my box, I would go about it this way:

  • back up the box (if it’s a VM, snapshot it)
  • check for all previous (still present) Checkmk packages, and remove them:
# as root - check which versions of CMK are installed
omd versions
# output from my box:
2.1.0p32.cre
2.1.0p33.cre (default)

# as root - remove all old CMK versions on the box that are no longer in use
omd cleanup
# output (I had already run this, so no old versions are present anymore)
2.1.0p33.cre         Keeping this version, since it is the default.

Then I would force a reinstall of the CMK package, just to make sure that everything installed is original.

# as root stop omd (just to be sure)
omd stop

# as root force the reinstall of the package
rpm -ivh <checkmk--raw-rpm> --force

# as root restart omd
omd start

If the issue is still there after that, I would migrate to a new box, which I did recently when I went from Rocky Linux 8 → 9.

  • Glowsome

Hello Glowsome,

Thanks for looking and the tips.
I do a cleanup after every update as soon as everything has been stable for a week, so there was only the current version on our system.
Unfortunately, reinstalling with --force didn’t change the situation, so we probably won’t be able to avoid a new installation.
But that means you also suspect that config files or settings are defective and are not reset or overwritten when the environment is updated.
If only we knew which ones those could be :)

Best regards
Sebastian

Hello everyone,

When we looked through the interface, we noticed that when we click on Show Analysis under Setup → Events → Notification configuration, we get an error message:

Currently there are no unsent notification bulks pending.
Internal error: source code string cannot contain null bytes

An internal error occured while processing your request. You can report this issue to the Checkmk team to help fixing this issue. Please open the [crash report page](https://%IP from the Server%/%CheckMKSite%/check_mk/crash.py?crash_id=%Crash ID%&mode=notifications&site=%CheckMKSite%) and use the form for reporting the problem.

The crash report that was then generated contains the following data:

Crash Report:
Exception   
ValueError (source code string cannot contain null bytes)

Traceback   
  File "/omd/sites/%CheckMKSite%/lib/python3/cmk/gui/wsgi/applications/checkmk.py", line 241, in _process_request
    resp = page_handler()
  File "/omd/sites/%CheckMKSite%/lib/python3/cmk/gui/wsgi/applications/utils.py", line 56, in _call_auth
    func()
  File "/omd/sites/%CheckMKSite%/lib/python3/cmk/gui/pages.py", line 195, in <lambda>
    return (lambda hc: lambda: hc().handle_page())(handle_class)
  File "/omd/sites/%CheckMKSite%/lib/python3/cmk/gui/pages.py", line 48, in handle_page
    self.page()
  File "/omd/sites/%CheckMKSite%/lib/python3/cmk/gui/pages.py", line 165, in <lambda>
    "page": lambda self: self._wrapped_callable[0](),
  File "/omd/sites/%CheckMKSite%/lib/python3/cmk/gui/wato/page_handler.py", line 95, in page_handler
    _wato_page_handler(current_mode, mode_permissions, mode_class)
  File "/omd/sites/%CheckMKSite%/lib/python3/cmk/gui/wato/page_handler.py", line 172, in _wato_page_handler
    mode.handle_page()
  File "/omd/sites/%CheckMKSite%/lib/python3/cmk/gui/plugins/wato/utils/base_modes.py", line 152, in handle_page
    return self.page()
  File "/omd/sites/%CheckMKSite%/lib/python3/cmk/gui/wato/pages/notifications.py", line 647, in page
    self._show_notification_backlog()
  File "/omd/sites/%CheckMKSite%/lib/python3/cmk/gui/wato/pages/notifications.py", line 737, in _show_notification_backlog
    backlog = store.load_object_from_file(
  File "/omd/sites/%CheckMKSite%/lib/python3/cmk/utils/store/__init__.py", line 171, in load_object_from_file
    return ObjectStore(Path(path), serializer=DimSerializer()).read_obj(default=default)
  File "/omd/sites/%CheckMKSite%/lib/python3/cmk/utils/store/_file.py", line 97, in read_obj
    return self._serializer.deserialize(raw) if raw else default
  File "/omd/sites/%CheckMKSite%/lib/python3/cmk/utils/store/_file.py", line 63, in deserialize
    return literal_eval(raw.decode("utf-8"))
  File "/omd/sites/%CheckMKSite%/lib/python3.9/ast.py", line 62, in literal_eval
    node_or_string = parse(node_or_string, mode='eval')
  File "/omd/sites/%CheckMKSite%/lib/python3.9/ast.py", line 50, in parse
    return compile(source, filename, mode, flags,


Local Variables
{'feature_version': -1,
 'filename': '<unknown>',
 'flags': 1024,
 'mode': 'eval',
 'source': A LOT OF SERVER NAMES AND IPs
and then
}\n"
           "\x00\x00\x00\x00\x00\x00\x00\x00 (... several hundred more \x00 bytes ...) \x00\x00\x00\x00\x00\x00\x00\x00
{(      
A LOT OF SERVER NAMES AND IPs
'type_comments': False}


Crash Type  gui
Time  2023-11-29 17:23:10
Operating System  CentOS Linux release 7.9.2009 (Core)
Checkmk Version   2.1.0p33
Edition     cre
Core  nagios
Python Version    3.9.16 (main, Jul 25 2023, 22:46:22) [GCC 12.2.0]
Python Module Paths     /omd/sites/%CheckMKSite%/local/lib/python3
/omd/sites/%CheckMKSite%/lib/python3/plus
/omd/sites/%CheckMKSite%/lib/python39.zip
/omd/sites/%CheckMKSite%/lib/python3.9
/omd/sites/%CheckMKSite%/lib/python3.9/lib-dynload
/omd/sites/%CheckMKSite%/lib/python3.9/site-packages
/omd/sites/%CheckMKSite%/lib/python3
Details
Page  wato.py
Request Method    GET
HTTP Parameters   
POST / GET Variables
mode  notifications
Referer     https://%SERVERNAME%/%CheckMKSite%/check_mk/index.py?start_url=%2F%CheckMKSite%%2Fcheck_mk%2Fdashboard.py
Username    
User Agent  Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36
Mobile GUI  
SSL   
Language 

Is this information perhaps helpful in pointing to the error, or do we have another problem here?
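
For reference, the exception at the bottom of the traceback can be reproduced in isolation. The sketch below only mimics the decode-then-`literal_eval` step visible in the traceback; the helper name `try_deserialize` and the sample data are invented for illustration and are not Checkmk code.

```python
import ast


def try_deserialize(raw: bytes):
    """Mimic the step from the traceback: decode as UTF-8, then ast.literal_eval."""
    try:
        return ast.literal_eval(raw.decode("utf-8")), None
    except (ValueError, SyntaxError) as exc:
        # Python 3.9 raises ValueError for null bytes; newer versions may raise
        # SyntaxError instead, so both are caught here.
        return None, f"{type(exc).__name__}: {exc}"


good = b"{'host': 'example'}\n"
bad = good + b"\x00" * 16  # the same content followed by null-byte padding

print(try_deserialize(good))
print(try_deserialize(bad))
```

The first call returns the parsed dictionary, the second returns an error. This suggests that whatever file the GUI is reading here contains valid data followed by a run of null bytes.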

Best regards

I have the same problem. No one knows how to solve it! This is a huge bug without any support.

Unfortunately, an upgrade to the latest version (2.2.0p18) didn’t fix the problem.
The error at Show Analysis under Setup → Events → Notification configuration remains and still generates a crash report, which cannot be sent because the system does not send emails.

I also upgraded to this version, and it did not solve the problem for me either.

The question that arises for me is, what has broken here that cannot be repaired or overwritten/replaced by an update or an upgrade?

After my last posting I went looking again and found the error.
I backed up and deleted the file backlog.mk and all .backlog.mk.new34928 files from the /omd/sites/%sitename%/var/check_mk/notify/ directory, then created a new, empty backlog.mk with the correct ownership and permissions. After restarting the site, the emails suddenly arrived as if by magic.
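
To check whether the files in that directory are affected before deleting anything, a quick scan for null bytes could look like the sketch below. The function name `files_with_null_bytes` is hypothetical, and the code only reads files; it does not modify or remove them.

```python
from pathlib import Path


def files_with_null_bytes(directory):
    """Return the names of regular files in `directory` that contain a null byte."""
    hits = []
    for path in sorted(Path(directory).iterdir()):
        # Read each regular file as raw bytes and look for at least one \x00.
        if path.is_file() and b"\x00" in path.read_bytes():
            hits.append(path.name)
    return hits
```

Run as the site user, it would be called with the notify directory mentioned above, for example `files_with_null_bytes("/omd/sites/%sitename%/var/check_mk/notify")` with the site name filled in.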

Did you see my post from yesterday and can you relate to that?

Hello @sebastian.rohde I’ve tested your suggestion, but unfortunately, it did not work as expected. Additionally, I observed that when I run a Fake Check Result, there is no data being logged in the notify.log file.

That’s a pity. Maybe something else is broken in your system.