That’s why I said: first install a clean 2.3 and check whether it works as expected.
I don’t know what the original situation on your system was.
Is it a distributed system? How many MKPs are installed? And so on.
That is not an easy task that can be done on the fly.
For the password problem - what CMK version is this site running?
I will create a new test site tomorrow then! Originally it was working using your older plugin for the HPE iLO as well as an older Redfish plugin, which was specific to CMK 2.1. We then upgraded to 2.3 along the following path, as per support’s advice: 2.1.0p32 > 2.1.0p44 > 2.2.0p27 > 2.3.0p6.
Then I tried to use the built-in plugin, which had these issues straight out of the box.
As I said, though, I will see whether these issues occur on a clean site as per your recommendation, since the upgrade path we took may have introduced a lot of deeper issues!
We are currently on 2.3.0p6 and running a single instance/site on a self-managed Ubuntu LTS 22.04 VM.
High load from the plugin should normally not exist.
The biggest difference between the old iLO Redfish plugin and the generic Redfish one is the session cache. This cache file is written with the host IP as the unique identifier.
If the same user and the same IP are used by another monitoring object (which should not be the case), that can cause problems. But even then it should only affect a single object, not all of them.
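To illustrate the kind of collision meant here, a minimal sketch. This is not the plugin's actual code: the file naming scheme, the helper names `session_cache_path` and `find_shared_caches`, the cache directory and the host names/IPs are all made up for illustration. It only shows why two monitoring objects that point at the same IP would end up sharing one session file.

```python
from __future__ import annotations

from pathlib import Path


def session_cache_path(cache_dir: Path, host_ip: str) -> Path:
    # Assumption: one cache file per host IP; the real file name may differ.
    return cache_dir / f"redfish_session_{host_ip}.json"


def find_shared_caches(hosts: dict[str, str], cache_dir: Path) -> dict[Path, list[str]]:
    """Return every cache file that more than one monitoring object would use."""
    by_path: dict[Path, list[str]] = {}
    for name, ip in hosts.items():
        by_path.setdefault(session_cache_path(cache_dir, ip), []).append(name)
    return {path: names for path, names in by_path.items() if len(names) > 1}


if __name__ == "__main__":
    # Hypothetical host objects (name -> management IP), documentation address range.
    hosts = {
        "ilo-server-a": "192.0.2.10",
        "ilo-server-a-copy": "192.0.2.10",
        "ilo-server-b": "192.0.2.11",
    }
    for path, names in find_shared_caches(hosts, Path("/tmp/redfish_cache")).items():
        print(f"{path} is shared by: {', '.join(names)}")
```

If a check like this reports nothing for your host list, a shared session file is probably not the cause.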
So I added one more iLO, and it immediately started crashing. It didn’t crash at all overnight with just a single host, but as soon as I added a second one the crashes started.
I am going to try removing the host that had no issues and see what happens to the new one I added today.
Hello, I’m facing the same problem as mentioned above. Updated from 2.2 to 2.3 and suddenly all redfish checks went UNKNOWN.
Installed the newest release of the plugin and I receive the same error message.
Any chance that you did not remove the previously installed Python packages?
With 2.1/2.2 you had to install extra Python packages with “pip install redfish 'urllib3<2'”. These packages should be removed before or after the upgrade to 2.3, as they are now included.
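A quick way to see which copies of these packages Python actually picks up is sketched below. It uses only the standard library (`importlib.metadata`); the helper name `installed_packages` is made up for this example. The assumption is that you run it as the site user with the site's own `python3`, so that it sees the same packages as the special agent.

```python
from __future__ import annotations

from importlib import metadata


def installed_packages(names: list[str]) -> dict[str, tuple[str, str] | None]:
    """Map each package name to (version, install location), or None if not found."""
    found: dict[str, tuple[str, str] | None] = {}
    for name in names:
        try:
            dist = metadata.distribution(name)
        except metadata.PackageNotFoundError:
            found[name] = None
        else:
            # locate_file("") resolves to the directory the package metadata lives in.
            found[name] = (dist.version, str(dist.locate_file("")))
    return found


if __name__ == "__main__":
    # The two packages named in the old manual install command.
    for name, info in installed_packages(["redfish", "urllib3"]).items():
        if info is None:
            print(f"{name}: not installed")
        else:
            version, location = info
            print(f"{name}: {version} in {location}")
```

If the reported location is not the Python directory shipped with the Checkmk version, that is a hint that a manually installed copy is still shadowing the bundled one.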
Also, please don’t use SNMP at the same time on this management interface, only the special agent.
This is not a site upgraded from 2.2 where the Redfish plugin was working before, is it?
Normally there are way more libs installed.
If it is a clean installation of 2.3, then I don’t know where the “normalizer” comes from. Looks strange.
I did go in and remove a LOT of Python packages, since the upgrade failed multiple times until I did this. I did not know where most of these came from either.
I did take a dump of the original contents of the Python plugins before I eviscerated them:
The normalizer plugin got uninstalled, but it left some weird folder names behind, as you can see: “~harset_normalizer”. I tried removing it, but that didn’t work! Do you think it could be causing a problem?
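As far as I know, pip temporarily renames a package directory by replacing its leading character(s) with `~` while it uninstalls, so a name like this usually points to an interrupted uninstall. Below is a small sketch to list such leftovers. The helper name `find_pip_leftovers` is made up, and scanning the directories returned by the standard `site` module is an assumption about where the packages live on your system. It only lists the directories, it does not delete anything.

```python
from __future__ import annotations

import site
from pathlib import Path


def find_pip_leftovers(package_dirs: list[Path]) -> list[Path]:
    """Return entries whose name starts with '~' in the given package directories."""
    leftovers: list[Path] = []
    for directory in package_dirs:
        if not directory.is_dir():
            continue  # skip directories that do not exist on this system
        leftovers.extend(p for p in directory.iterdir() if p.name.startswith("~"))
    return sorted(leftovers)


if __name__ == "__main__":
    # Global and per-user package directories of the interpreter running this script.
    dirs = [Path(p) for p in site.getsitepackages()]
    dirs.append(Path(site.getusersitepackages()))
    for leftover in find_pip_leftovers(dirs):
        print(leftover)
```

Run it with the same Python interpreter that the site uses, otherwise it will look in the wrong directories.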
Just adding more info here as I’m still troubleshooting.
We have added a few more iLO hosts. What I am seeing I can only describe as weird…
One of the iLOs we added is not having any issues whatsoever with the Redfish agent crashing. As soon as I added another one, the new one started having issues straight away. The weird part is that the original iLO is still fine. I am now wondering whether something is wrong with the iLOs themselves, but I’ve rebooted the iLO and even gone as far as rebooting the host, and nothing improved.
The alert history for the problematic iLO reveals some interesting timings - it seems to be triggering alerts almost exactly every 5 minutes:
I’ve added 2 more iLOs from a completely different site, and these don’t seem to be having any issues either.
It should be noted that ALL of these iLOs have very recently been upgraded to the latest iLO firmware from HPE (v3.04), and they are all set up the same way.
The iLOs having issues are all on the same network as the Checkmk server (even on the same subnet), so I don’t see this being a network issue. The working ones are on a different subnet at a different site entirely and are reachable through an SD-WAN VPN.
Could there have been an issue with these iLOs all along, rather than with the Checkmk server or Redfish?
If so, I don’t know what, since I’ve tried rebooting them and they are all set up correctly.