I just installed the new version 2.3.52 and now I get an error when I try to retrieve the ILOs of my HPE G10+. But I also get the message on my two Cisco servers.
Did I configure something wrong? I updated from 2.3.45 to 2.3.52.
Can you please execute the agent on the command line with the "--debug" and "-vv" switches?
The code that does the import from "redfish.messages" is not active at the moment. It is only preparation for the next versions.
I checked the import on both 2.2 and 2.3 - it was working without problems.
2.3
OMD[cmk]:~$ python3
Python 3.12.3 (main, May 7 2024, 15:13:53) [GCC 13.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from redfish.messages import (
... get_messages_detail,
... get_error_messages,
... search_message,
... RedfishPasswordChangeRequiredError,
... RedfishOperationFailedError,
... )
>>>
2.2
OMD[cmk]:~$ python3
Python 3.11.5 (main, Nov 30 2023, 14:57:54) [GCC 13.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from redfish.messages import (
... get_messages_detail,
... get_error_messages,
... search_message,
... RedfishPasswordChangeRequiredError,
... RedfishOperationFailedError,
... )
>>>
Please check your "~/local/lib/python3/" - there should be no Redfish folder there anymore with CMK 2.3.
If there is one, it needs to be removed (it is the old Redfish lib from 2.2 that was installed manually).
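To see which copy of the library Python actually picks up, a small check like the following can help. This is only a sketch: the helper names module_origin and is_under are made up for illustration, and it has to be run with the site's python3 as the site user so that "~" points at the site home.

```python
import importlib.util
from pathlib import Path


def module_origin(name):
    """Return the file a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


def is_under(path, directory):
    """Return True if path lies inside directory."""
    try:
        Path(path).resolve().relative_to(Path(directory).expanduser().resolve())
        return True
    except ValueError:
        return False


if __name__ == "__main__":
    origin = module_origin("redfish")
    print("redfish is loaded from:", origin)
    # A hit below ~/local/lib/python3 points at a manually installed copy.
    if origin and is_under(origin, "~/local/lib/python3"):
        print("A local copy is shadowing the library shipped with the site.")
```

If the printed path is below ~/local/lib/python3, that folder is the leftover to remove.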
Some information for users of CMK 2.2
If you update to version 2.2.52 of the Redfish plugin, some services will get different names.
After that, all the service names are the same for 2.2 and 2.3.
Sorry, this is an incompatible change, but it needed to be done at some point.
Hi, firstly thank you very much for spending time on this plugin. Really appreciated! I was wondering how I can work around very slow BMCs. I have a few Gigabyte boards that I would like to monitor with the plugin but I am having trouble doing the inventory and I think it's because it just takes too long, e.g.:
$ agent_redfish -u xxx -s xxx -v --debug --timeout 30 n12345
INFO 2024-07-25 11:08:26 root: running file /omd/sites/hpcwatch1/lib/python3/cmk/special_agents/utils/agent_common.py
INFO 2024-07-25 11:08:26 root: using Python interpreter v3.11.5.final.0 at /omd/sites/hpcwatch1/bin/python3
INFO 2024-07-25 11:08:26 redfish: Redfish API
INFO 2024-07-25 11:08:26 redfish.rest.v1: Attempt 1 of /redfish/v1
INFO 2024-07-25 11:08:38 redfish.rest.v1: Response Time for GET to /redfish/v1: 11.636435125023127 seconds.
So that initial step already takes more than 10 seconds. Scanning the whole tree takes just over 2 minutes - is this a lost cause or can I do something?
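For anyone who wants to see where the two minutes go, the "Response Time for GET" lines of such a debug log can be collected and sorted. This is a rough sketch that only assumes the log line format shown above; the function names are made up for illustration.

```python
import re

# Matches lines like:
# "... Response Time for GET to /redfish/v1: 11.63 seconds."
RESPONSE_TIME = re.compile(r"Response Time for GET to (\S+): ([0-9.]+) seconds")


def response_times(log_text):
    """Return a list of (path, seconds) for every timed GET in the log."""
    return [(m.group(1), float(m.group(2))) for m in RESPONSE_TIME.finditer(log_text)]


def slowest(log_text, count=5):
    """Return the slowest requests first, at most count entries."""
    ordered = sorted(response_times(log_text), key=lambda item: item[1], reverse=True)
    return ordered[:count]


def total_seconds(log_text):
    """Sum of all response times found in the log."""
    return sum(seconds for _, seconds in response_times(log_text))
```

Feeding the saved agent output into slowest() shows which Redfish paths are worth disabling first.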
It looks very bad. The initial fetch should be nearly immediate.
How long does the system need if you only enable one section, like fan and temperature here?
If this time is acceptable, then I would do it this way.
You get the system roll-up state and the single fan and temperature services.
In case of a hardware failure besides the fans and temperatures, you will get a message at the "System state" service.
Hi @andreas-doehler well my first problem is that I am unable to inventory a node with the plugin enabled. Is there an internal timeout in checkmk for that process that will not wait for the plugin to finish and is there a way to increase that timeout?
If you get a timeout with only "Fan and Temperatures" active, then it needs longer than 60 seconds.
That is normally the check timeout inside CMK.
With the Enterprise edition you can define your own timeouts for single hosts.
With the RAW edition you need to manually change the Nagios core config.
With your system I would do the following steps:
1. Define the rule as shown in my screenshot.
2. On the command line, as site user, run cmk --debug -vvI hostname
3. Look at the time needed - you can see it in the output.
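The time measurement can also be scripted if you want to compare several hosts. The following is only a sketch: time_command is a made-up helper, and the 60 second limit is simply the normal check timeout mentioned above, not a value read from the site configuration.

```python
import subprocess
import sys
import time


def time_command(command, limit_seconds=60.0):
    """Run a command and return (elapsed seconds, exit code, exceeded limit)."""
    start = time.monotonic()
    completed = subprocess.run(command, capture_output=True, text=True)
    elapsed = time.monotonic() - start
    return elapsed, completed.returncode, elapsed > limit_seconds


if __name__ == "__main__":
    # Pass the host name as the first argument; run this as the site user.
    host = sys.argv[1]
    elapsed, code, too_slow = time_command(["cmk", "--debug", "-vvI", host])
    print(f"{host}: {elapsed:.1f} s, exit code {code}")
    if too_slow:
        print("Longer than the default check timeout, so a higher timeout is needed.")
```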
So the agent exits with code 1 - that call clearly does not wait long enough. By the way this is CEE so based on what you said I should be able to increase the timeout. I'll take a look.
I got a little bit further, I ended up with a stack:
...
File "/omd/sites/hpcwatch1/local/lib/python3/cmk/base/plugins/agent_based/redfish_fans.py", line 36, in discovery_redfish_fans
for fan in fans:
TypeError: 'NoneType' object is not iterable
I fixed this in the code and the discovery is now finishing. Thanks a lot for the help!
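For reference, a guard against this kind of error can look like the sketch below. It is not the plugin's actual code: the function name, the "Fans" and "Name" keys and the plain names it yields are assumptions for illustration, since the real section layout is not shown in this thread.

```python
def discover_fan_names(section):
    """Yield fan names, tolerating a missing or empty fan list."""
    # A BMC may report no section at all or "Fans": null (assumed layout).
    fans = section.get("Fans") if section else None
    # "fans or []" turns None into an empty list, so the loop is skipped
    # instead of raising "TypeError: 'NoneType' object is not iterable".
    for fan in fans or []:
        name = fan.get("Name")
        if name:
            yield name
```

With this pattern a board that reports no fans just produces no fan services.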
It would be nice if you could send me a dump of your Redfish interface created with Redfish Mockup Creator.
What version of the plugin do you use?
The mentioned line is already fixed here with this commit.
As a new user (new not only in time since signing up, but also in interaction in the forum), it is possible that you cannot initiate a PM; this restriction is there to avoid spam. Andreas should be able to start the conversation with you, though.
We use Fujitsu Primergy Servers and the Redfish-Plugin is working very well.
But the Power Supply values seem to be incorrect.
This is the output from a running server.
Power supply 0-PSU1 0.0 Watts input, 0.0 Watts output, 0.0 V input, Capacity 2600.0 Watts, Typ CDR26214M3
Power supply 1-PSU2 0.0 Watts input, 0.0 Watts output, 0.0 V input, Capacity 2600.0 Watts, Typ CDR26214M3
Can you post or send me the raw agent output of the power section?
If there are no usable values inside, then the check cannot show any information besides the status data.
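A quick way to look at one power supply entry from the raw data is sketched below. The property names PowerInputWatts, PowerOutputWatts and LineInputVoltage come from the Redfish Power schema; whether the plugin reads exactly these keys, and the helper name classify_measurements, are assumptions for illustration.

```python
# Measurement properties of a PowerSupplies entry in the Redfish Power schema.
MEASUREMENT_KEYS = ("PowerInputWatts", "PowerOutputWatts", "LineInputVoltage")


def classify_measurements(psu):
    """Label each measurement of one power supply as missing, zero or ok."""
    result = {}
    for key in MEASUREMENT_KEYS:
        value = psu.get(key)
        if value is None:
            result[key] = "missing"  # key absent or reported as null
        elif value == 0:
            result[key] = "zero"     # BMC really reports 0
        else:
            result[key] = "ok"
    return result
```

If everything comes back as "missing" or "zero", the BMC itself delivers no readings and the check can only show the status.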
I'm redoing some of my redfish hosts in a new site and noticed something that I'd like to fix. Most of my redfish compatible hosts have two 10G SFP+ ports and two 1G ethernet ports. We're currently only using the first 10G port and the other three all show up in a warning state. How can I tell the system that the warnings are kind of a false positive, they aren't connected and that's the normal state at the moment? It's not a big deal but it does add info that isn't needed into the list of services that aren't OK, and it messes up the signal-to-noise ratio.