Upgrade 2.3.0p45 -> 2.4.0p24 results in errors on activation

CMK version: 2.3.0p45 CRE/community
OS version: Rocky9

Error message:
During the upgrade, errors are reported regarding the Redfish plug-in:

Creating temporary filesystem /omd/sites/comsolve/tmp...OK
Executing 'cmk-update-config --conflict ask --dry-run'
-| ATTENTION
-|   Some steps may take a long time depending on your installation.
-|   Please be patient.
-|
-| Cleanup precompiled host and folder files
-| Verifying Checkmk configuration...
-| Error loading rulespecs: [ValueError("cmk_addons.plugins.redfish.rulesets.datasource_program:rule_spec_redfish_datasource_programs: plug-in 'redfish' already defined at cmk.plugins.redfish.rulesets.datasource_program:rule_spec_redfish_datasource_programs"), ValueError("cmk_addons.plugins.redfish.rulesets.datasource_program:rule_spec_redfish_power_datasource_programs: plug-in 'redfish_power' already defined at cmk.plugins.redfish.rulesets.datasource_program:rule_spec_redfish_power_datasource_programs"), ValueError("cmk_addons.plugins.redfish.rulesets.redfish_ethernetinterfaces:rule_spec_discovery_redfish_ethernetinterfaces: plug-in 'discovery_redfish_ethernetinterfaces' already defined at cmk.plugins.redfish.rulesets.redfish_ethernetinterfaces:rule_spec_discovery_redfish_ethernetinterfaces"), ValueError("cmk_addons.plugins.redfish.rulesets.redfish_outlets:rule_spec_discovery_redfish_outlets: plug-in 'discovery_redfish_outlets' already defined at cmk.plugins.redfish.rulesets.redfish_outlets:rule_spec_discovery_redfish_outlets")]
-| cmk_addons.plugins.redfish.graphing.translation:translation_redfish_outlets: plug-in 'redfish_outlets' already defined at cmk.plugins.redfish.graphing.translation:translation_redfish_outlets
-| cmk_addons.plugins.redfish.graphing.metrics:metric_input_power: plug-in 'input_power' already defined at cmk.plugins.redfish.graphing.power:metric_input_power
-| cmk_addons.plugins.redfish.graphing.metrics:metric_output_power: plug-in 'output_power' already defined at cmk.plugins.redfish.graphing.power:metric_output_power
-| cmk_addons.plugins.redfish.graphing.metrics:metric_input_voltage: plug-in 'input_voltage' already defined at cmk.plugins.redfish.graphing.voltage:metric_input_voltage
-| cmk_addons.plugins.redfish.graphing.metrics:metric_media_life_left: plug-in 'media_life_left' already defined at cmk.plugins.redfish.graphing.ssddrives:metric_media_life_left
-| cmk_addons.plugins.redfish.graphing.metrics:metric_ssd_utilization: plug-in 'ssd_utilization' already defined at cmk.plugins.redfish.graphing.ssddrives:metric_ssd_utilization
-| cmk_addons.plugins.redfish.graphing.perfometer:perfometer_input_output_power: plug-in 'power_summary' already defined at cmk.plugins.redfish.graphing.power:perfometer_input_output_power
-|  01/09 Legacy check plug-ins...
-|  02/09 Rulesets...
-|  03/09 UI extensions...
-| Error loading rulespecs: [ValueError("cmk_addons.plugins.redfish.rulesets.datasource_program:rule_spec_redfish_datasource_programs: plug-in 'redfish' already defined at cmk.plugins.redfish.rulesets.datasource_program:rule_spec_redfish_datasource_programs"), ValueError("cmk_addons.plugins.redfish.rulesets.datasource_program:rule_spec_redfish_power_datasource_programs: plug-in 'redfish_power' already defined at cmk.plugins.redfish.rulesets.datasource_program:rule_spec_redfish_power_datasource_programs"), ValueError("cmk_addons.plugins.redfish.rulesets.redfish_ethernetinterfaces:rule_spec_discovery_redfish_ethernetinterfaces: plug-in 'discovery_redfish_ethernetinterfaces' already defined at cmk.plugins.redfish.rulesets.redfish_ethernetinterfaces:rule_spec_discovery_redfish_ethernetinterfaces"), ValueError("cmk_addons.plugins.redfish.rulesets.redfish_outlets:rule_spec_discovery_redfish_outlets: plug-in 'discovery_redfish_outlets' already defined at cmk.plugins.redfish.rulesets.redfish_outlets:rule_spec_discovery_redfish_outlets")]
-| cmk_addons.plugins.redfish.graphing.translation:translation_redfish_outlets: plug-in 'redfish_outlets' already defined at cmk.plugins.redfish.graphing.translation:translation_redfish_outlets
-| cmk_addons.plugins.redfish.graphing.metrics:metric_input_power: plug-in 'input_power' already defined at cmk.plugins.redfish.graphing.power:metric_input_power
-| cmk_addons.plugins.redfish.graphing.metrics:metric_output_power: plug-in 'output_power' already defined at cmk.plugins.redfish.graphing.power:metric_output_power
-| cmk_addons.plugins.redfish.graphing.metrics:metric_input_voltage: plug-in 'input_voltage' already defined at cmk.plugins.redfish.graphing.voltage:metric_input_voltage
-| cmk_addons.plugins.redfish.graphing.metrics:metric_media_life_left: plug-in 'media_life_left' already defined at cmk.plugins.redfish.graphing.ssddrives:metric_media_life_left
-| cmk_addons.plugins.redfish.graphing.metrics:metric_ssd_utilization: plug-in 'ssd_utilization' already defined at cmk.plugins.redfish.graphing.ssddrives:metric_ssd_utilization
-| cmk_addons.plugins.redfish.graphing.perfometer:perfometer_input_output_power: plug-in 'power_summary' already defined at cmk.plugins.redfish.graphing.power:perfometer_input_output_power
-| [redfish 2.3.76]: Ignoring problems (MKP will be disabled on target version)
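
For a quick overview of which add-on plug-ins collide with ones that ship with the target version, the "already defined at" lines above can be condensed with a small script. This is only a sketch: the regular expression is an assumption based on the message format visible in this log, and `find_conflicts` is a hypothetical helper name, not part of Checkmk.

```python
import re

# Assumed message shape, taken from the log lines above:
#   <cmk_addons location>: plug-in '<name>' already defined at <cmk.plugins location>
CONFLICT_RE = re.compile(
    r"(cmk_addons\.plugins\.[\w.]+:\w+): plug-in '([^']+)' "
    r"already defined at (cmk\.plugins\.[\w.]+:\w+)"
)


def find_conflicts(log_text: str) -> dict[str, tuple[str, str]]:
    """Map each conflicting plug-in name to (add-on location, shipped location)."""
    conflicts: dict[str, tuple[str, str]] = {}
    for addon, name, shipped in CONFLICT_RE.findall(log_text):
        # The same conflict is reported several times; keep one entry per name.
        conflicts[name] = (addon, shipped)
    return conflicts


# One line copied from the output above, used as sample input.
SAMPLE = (
    "-| cmk_addons.plugins.redfish.graphing.metrics:metric_input_power: "
    "plug-in 'input_power' already defined at "
    "cmk.plugins.redfish.graphing.power:metric_input_power"
)

for plugin_name, (addon_loc, shipped_loc) in sorted(find_conflicts(SAMPLE).items()):
    print(f"{plugin_name}: {addon_loc} duplicates {shipped_loc}")
```

Feeding it the complete update output instead of `SAMPLE` should give one line per duplicated plug-in name, which makes it easier to see that everything reported here belongs to the redfish MKP.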

This does not block the upgrade. However, after the upgrade (which finishes with a global status of OK), starting the site and then trying to update the configuration of the site results in the following message:

If I check omd status, everything is green.

I rolled back to a snapshot and retried, but unfortunately got the same result.

I’m looking for directions on how to troubleshoot this on a future attempt.

  • Glowsome

The first messages only come from the Redfish MKP, which is disabled later in the update process.

The UI job scheduler is more likely the problem here. You can check the scheduler on the command line; its log is also important.
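
One way to check from the command line whether the scheduler is reachable at all is to try connecting to its Unix socket. A minimal sketch, assuming the socket lives at tmp/run/ui-job-scheduler.sock below the site directory and that the script runs as the site user. `probe_unix_socket` is a hypothetical helper, not a Checkmk tool, and a successful connect only shows that something is listening, not that the scheduler is healthy.

```python
import socket
from pathlib import Path


def probe_unix_socket(path: Path, timeout: float = 2.0) -> str:
    """Return a short status string for a Unix domain socket path."""
    if not path.exists():
        return "missing"
    if not path.is_socket():
        return "not a socket"
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        try:
            # Succeeds only if a process is listening on the socket.
            sock.connect(str(path))
        except OSError as exc:
            return f"connect failed: {exc}"
    return "accepting connections"
```

As the site user this could be called as `probe_unix_socket(Path.home() / "tmp/run/ui-job-scheduler.sock")`. A result of "missing" or "connect failed" would point at the scheduler process itself, and its log is then the next place to look.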

After a long time without a chance to reproduce the upgrade and the error, I finally got time to redo it (2.3.0p46 → 2.4.0p27). This is what I found in /omd/sites/Mysite/tmp/run/:

OMD[Mysite]:~/tmp/run$ ll
total 16
-rw-r--r-- 1 Mysite Mysite   4 May  7 00:35 agent-receiver.pid
-rw-r--r-- 1 Mysite Mysite   4 May  7 00:35 automation-helper.pid
srwx------ 1 Mysite Mysite   0 May  7 00:35 automation-helper.sock=
-rw------- 1 Mysite Mysite   4 May  7 00:35 cmk-ui-job-scheduler.pid
srw-rw---- 1 Mysite Mysite   0 May  7 00:35 live=
lrwxrwxrwx 1 Mysite Mysite   8 May  7 00:35 live-tcp -> live-tls
drwxr-x--x 2 Mysite Mysite 120 May  7 00:35 mkeventd/
prw-rw---- 1 Mysite Mysite   0 May  7 00:35 nagios.cmd|
srwx------ 1 Mysite Mysite   0 May  7 00:35 redis=
-rw------- 1 Mysite Mysite   4 May  7 00:35 redis-server.pid
srw-rw---- 1 Mysite Mysite   0 May  7 00:35 rrdcached.sock=
srw------- 1 Mysite Mysite   0 May  7 00:35 ui-job-scheduler.sock=

All sockets have a ‘=’ appended, and nagios.cmd has a pipe character appended?
Compare this to a test box (already on 2.4):

-rw-r--r-- 1 testsite testsite    4 May  6 23:26 agent-receiver.pid
-rw-r--r-- 1 testsite testsite    4 May  6 23:27 automation-helper.pid
srwx------ 1 testsite testsite    0 May  6 23:27 automation-helper.sock
-rw------- 1 testsite testsite    4 May  6 23:27 cmk-ui-job-scheduler.pid
srw-rw---- 1 testsite testsite    0 May  6 23:27 live
lrwxrwxrwx 1 testsite testsite    8 May  6 23:26 live-tcp -> live-tls
-rw------- 1 testsite testsite 2087 May  7 00:00 logrotate.state
drwxr-x--x 2 testsite testsite  120 May  6 23:26 mkeventd
prw-rw---- 1 testsite testsite    0 May  6 23:27 nagios.cmd
srwx------ 1 testsite testsite    0 May  6 23:26 redis
-rw------- 1 testsite testsite    4 May  6 23:26 redis-server.pid
srw-rw---- 1 testsite testsite    0 May  6 23:26 rrdcached.sock
srw------- 1 testsite testsite    0 May  6 23:27 ui-job-scheduler.sock

If I revert the site to a pre-upgrade snapshot, I still see sockets with an appended ‘=’ and nagios.cmd with an appended pipe character:

-rw-r--r-- 1 Mysite Mysite   4 May  7 00:56 agent-receiver.pid
srw-rw---- 1 Mysite Mysite   0 May  7 01:06 live=
lrwxrwxrwx 1 Mysite Mysite   8 May  7 00:56 live-tcp -> live-tls
drwxr-x--x 2 Mysite Mysite 120 May  7 00:56 mkeventd/
prw-rw---- 1 Mysite Mysite   0 May  7 00:56 nagios.cmd|
srwx------ 1 Mysite Mysite   0 May  7 00:56 redis=
-rw------- 1 Mysite Mysite   4 May  7 00:56 redis-server.pid
srw-rw---- 1 Mysite Mysite   0 May  7 00:56 rrdcached.sock=

But on that 2.3.0p46 version I have no issues.

Judging by the error after the upgrade, a failure would be understandable if it is looking for a socket without the appended ‘=’.
So did the upgrade somehow forget to correct this?
I have also looked at the service config files and their corresponding service (init.d) files, but was unable to find where the appended ‘=’ comes from.

BTW: Both boxes are same OS, same patchlevel ( RockyLinux 9.7)

Update: I’ve done some digging, and the appended characters seem to relate to OS settings carried over from an earlier OS upgrade, where a ‘=’ indicates a socket and a ‘|’ (obviously) indicates a pipe. In other words, they appear to be display indicators in the listing rather than part of the file names.
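
To confirm that the trailing characters are only display indicators, the directory can be listed without going through `ls` at all. The sketch below reproduces the suffixes that GNU `ls -F` (`--classify`) appends per file type, which is presumably what the `ll` alias on this box does. `classify_suffix` and `list_real_names` are hypothetical names chosen for illustration.

```python
import os
import stat
from pathlib import Path


def classify_suffix(mode: int) -> str:
    """Return the indicator GNU `ls -F` would append for this file mode."""
    if stat.S_ISDIR(mode):
        return "/"
    if stat.S_ISLNK(mode):
        return "@"
    if stat.S_ISSOCK(mode):
        return "="
    if stat.S_ISFIFO(mode):
        return "|"
    if stat.S_ISREG(mode) and mode & 0o111:
        return "*"  # regular file with an execute bit set
    return ""


def list_real_names(directory: Path) -> list[tuple[str, str]]:
    """Return (name as stored on disk, indicator `ls -F` would add) per entry."""
    entries = []
    for name in sorted(os.listdir(directory)):
        # lstat so that symlinks are reported as links, not as their targets.
        mode = os.lstat(directory / name).st_mode
        entries.append((name, classify_suffix(mode)))
    return entries
```

Running `list_real_names(Path.home() / "tmp/run")` as the site user should show names such as `live` and `nagios.cmd` without any suffix in the first column. If so, the sockets are named the same on both boxes and the ‘=’ is not what the upgraded site is tripping over.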

I then had a look at the logs, but found no mention of anything being unable to start or of any other issues.
So the mystery remains. I had to roll back and resume normal operations on the machine I attempted to upgrade.

This is somewhat frustrating, as it blocks my move to the next major version (2.4) and potentially to the most current release (2.5.0). All help is welcome.

  • Glowsome