Lenovo XCC, again

CEE 2.3.0p19

I’ve read every thread on Lenovo XCC, but I still cannot get it to work :).

So.

Option 1

https://exchange.checkmk.com/p/redfish

I created a read-only user for Redfish:

This is where I don’t know what to do.

I created a rule here, with the user:

And it fails:

I know I’m doing something wrong.
What, please?


Option 2

https://exchange.checkmk.com/p/lenovo-xclarity

Same. I must be doing something wrong.

Looking forward to hearing from you soon!

Thanks!!

Some console tests:

# curl --insecure -v https://10.200.12.55/redfish/v1

* Trying 10.200.12.55:443...
* Connected to 10.200.12.55 (10.200.12.55) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* CAfile: /etc/pki/tls/certs/ca-bundle.crt
* TLSv1.0 (OUT), TLS header, Certificate Status (22):
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS header, Certificate Status (22):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (OUT), TLS header, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (OUT), TLS header, Certificate Status (22):
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS header, Certificate Status (22):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS header, Unknown (23):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.2 (IN), TLS header, Unknown (23):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS header, Unknown (23):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.2 (IN), TLS header, Unknown (23):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.2 (OUT), TLS header, Unknown (23):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_256_GCM_SHA384
* ALPN, server accepted to use http/1.1
* Server certificate:
* subject: C=US; ST=NC; L=RTP; O=Lenovo; CN=XCC-7D75-JZ004WWX
* start date: Jul 26 07:03:01 2024 GMT
* expire date: Jul 26 07:03:01 2027 GMT
* issuer: C=US; ST=NC; L=RTP; O=Lenovo; CN=XCC-7D75-JZ004WWX
* SSL certificate verify result: self-signed certificate (18), continuing anyway.
* TLSv1.2 (OUT), TLS header, Unknown (23):
> GET /redfish/v1 HTTP/1.1
> Host: 10.200.12.55
> User-Agent: curl/7.76.1
> Accept: */*
>
* TLSv1.2 (IN), TLS header, Unknown (23):
* Mark bundle as not supporting multiuse
< HTTP/1.1 308 Permanent Redirect
< Date: Fri, 25 Oct 2024 20:45:11 GMT
< Content-Type: application/json
< Transfer-Encoding: chunked
< Connection: keep-alive
< Location: /redfish/v1/
< OData-Version: 4.0
< Content-Language: en
< Cache-Control: no-store
< Server: XCC Web Server
< Strict-Transport-Security: max-age=31536000; includeSubDomains
< Content-Security-Policy: default-src 'self'; connect-src *; script-src 'self'; img-src 'self' data:; style-src 'self'; font-src 'self'; child-src 'self'; object-src 'none'; frame-ancestors 'none'
< X-XSS-Protection: 1; mode=block
< X-Content-Type-Options: nosniff
< Cache-Control: no-cache, no-store, must-revalidate, private
< X-Frame-Options: DENY
< Referrer-Policy: same-origin
< X-Permitted-Cross-Domain-Policies: value
< X-Download-Options: value
<
{"Message":"You are being redirected to /redfish/v1/"}
* TLSv1.2 (IN), TLS header, Unknown (23):
* Connection #0 to host 10.200.12.55 left intact
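
A side note on the curl test: the 308 only says that the service root lives at /redfish/v1/ with a trailing slash, and curl does not follow redirects unless -L is given. A minimal sketch of building the URL with the trailing slash up front; the helper name service_root_url is hypothetical and not part of the redfish MKP:

```python
def service_root_url(host: str, path: str = "/redfish/v1") -> str:
    """Return https://<host><path>/ with exactly one trailing slash.

    Hypothetical helper for illustration only.
    """
    normalized = path.rstrip("/") + "/"
    return f"https://{host}{normalized}"


print(service_root_url("10.200.12.55"))
```

Requesting that URL directly should avoid the redirect seen in the test.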

cmk --debug --v

No piggyback files for '10.200.12.55'. Skip processing.
[cpu_tracking] Stop [7f478e4e2270 - Snapshot(process=posix.times_result(user=0.009999999999999787, system=0.0, children_user=0.0, children_system=0.0, elapsed=0.010000001639127731))]
[special_redfish] redfish.rest.v1.RetriesExhaustedError(!!), [piggyback] Success (but no data found for this host), execution time 15.1 sec | execution_time=15.070 user_time=0.010 system_time=0.000 children_user_time=0.570 children_system_time=0.040 cmk_time_ds=14.450 cmk_time_agent=0.000
Agent exited with code 1: Agent failed - please submit a crash report! (Crash-ID: 9eeb3d02-9311-11ef-9299-005056801e64)

Traceback (most recent call last):
  File "/omd/sites/hard2/local/lib/python3/urllib3/connectionpool.py", line 468, in _make_request
    six.raise_from(e, None)
  File "<string>", line 3, in raise_from
  File "/omd/sites/hard2/local/lib/python3/urllib3/connectionpool.py", line 463, in _make_request
    httplib_response = conn.getresponse()
                       ^^^^^^^^^^^^^^^^^^
  File "/omd/sites/hard2/lib/python3.12/http/client.py", line 1428, in getresponse
    response.begin()
  File "/omd/sites/hard2/lib/python3.12/http/client.py", line 331, in begin
    version, status, reason = self._read_status()
                              ^^^^^^^^^^^^^^^^^^^
  File "/omd/sites/hard2/lib/python3.12/http/client.py", line 292, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/omd/sites/hard2/lib/python3.12/socket.py", line 707, in readinto
    return self._sock.recv_into(b)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/omd/sites/hard2/lib/python3.12/ssl.py", line 1252, in recv_into
    return self.read(nbytes, buffer)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/omd/sites/hard2/lib/python3.12/ssl.py", line 1104, in read
    return self._sslobj.read(len, buffer)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TimeoutError: The read operation timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/omd/sites/hard2/local/lib/python3/requests/adapters.py", line 667, in send
    resp = conn.urlopen(
           ^^^^^^^^^^^^^
  File "/omd/sites/hard2/local/lib/python3/urllib3/connectionpool.py", line 802, in urlopen
    retries = retries.increment(
              ^^^^^^^^^^^^^^^^^^
  File "/omd/sites/hard2/local/lib/python3/urllib3/util/retry.py", line 552, in increment
    raise six.reraise(type(error), error, _stacktrace)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/omd/sites/hard2/local/lib/python3/urllib3/packages/six.py", line 770, in reraise
    raise value
  File "/omd/sites/hard2/local/lib/python3/urllib3/connectionpool.py", line 716, in urlopen
    httplib_response = self._make_request(
                       ^^^^^^^^^^^^^^^^^^^
  File "/omd/sites/hard2/local/lib/python3/urllib3/connectionpool.py", line 470, in _make_request
    self._raise_timeout(err=e, url=url, timeout_value=read_timeout)
  File "/omd/sites/hard2/local/lib/python3/urllib3/connectionpool.py", line 358, in _raise_timeout
    raise ReadTimeoutError(
urllib3.exceptions.ReadTimeoutError: HTTPSConnectionPool(host='10.200.12.55', port=443): Read timed out. (read timeout=3)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/omd/sites/hard2/local/lib/python3/redfish/rest/v1.py", line 915, in _rest_request
    resp = self._session.request(method.upper(), "{}{}".format(self.__base_url, reqpath), data=body,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/omd/sites/hard2/local/lib/python3/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/omd/sites/hard2/local/lib/python3/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/omd/sites/hard2/local/lib/python3/requests/adapters.py", line 713, in send
    raise ReadTimeout(e, request=request)
requests.exceptions.ReadTimeout: HTTPSConnectionPool(host='10.200.12.55', port=443): Read timed out. (read timeout=3)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/omd/sites/hard2/lib/python3/cmk/special_agents/v0_unstable/agent_common.py", line 149, in _special_agent_main_core
    return main_fn(args)
           ^^^^^^^^^^^^^
  File "/omd/sites/hard2/local/lib/python3/cmk_addons/plugins/redfish/special_agents/agent_redfish.py", line 737, in agent_redfish_main
    get_information(redfishobj, sections)
  File "/omd/sites/hard2/local/lib/python3/cmk_addons/plugins/redfish/special_agents/agent_redfish.py", line 577, in get_information
    firmwares = fetch_data(
                ^^^^^^^^^^^
  File "/omd/sites/hard2/local/lib/python3/cmk_addons/plugins/redfish/special_agents/agent_redfish.py", line 144, in fetch_data
    response_url = redfishobj.get(url, None)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/omd/sites/hard2/local/lib/python3/redfish/rest/v1.py", line 633, in get
    return self._rest_request(path, method='GET', args=args,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/omd/sites/hard2/local/lib/python3/redfish/rest/v1.py", line 1118, in _rest_request
    return super(HttpClient, self)._rest_request(path=path, method=method,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/omd/sites/hard2/local/lib/python3/redfish/rest/v1.py", line 959, in _rest_request
    raise RetriesExhaustedError() from cause_exception
redfish.rest.v1.RetriesExhaustedError(!!)

Two things.
First: as you are using CMK 2.3, you don’t need the redfish Python library inside your ~/local/lib/python3/ folder anymore.

Second, and more important, is the problem at the end of your traceback: the failing call is firmwares = fetch_data in get_information (agent_redfish.py), which runs into the read timeout of 3 seconds.

This means the special agent has a problem fetching the firmware data.
Please deactivate the “Firmware” module in the special agent configuration.
I’m working on a practical permanent solution for this problem. It also happens often on Dell iDRAC.

All the other configuration looks fine.
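
To check whether a BMC answers a given Redfish path within the 3-second read timeout from the traceback, a small timing probe can help. This is only a sketch: probe and always_times_out are hypothetical names, the fetch function is whatever HTTP call you plug in, and the firmware inventory URL is the standard Redfish location, which is an assumption about what the agent queries.

```python
import time
from typing import Callable


def probe(fetch: Callable[[str, float], object], url: str, timeout: float) -> tuple[bool, float]:
    """Call fetch(url, timeout); return (answered_in_time, elapsed_seconds)."""
    start = time.monotonic()
    try:
        fetch(url, timeout)
    except TimeoutError:
        return False, time.monotonic() - start
    return True, time.monotonic() - start


def always_times_out(url: str, timeout: float) -> None:
    """Stand-in for a BMC that never answers within the timeout."""
    raise TimeoutError("The read operation timed out")


# Assumed URL: the standard Redfish firmware inventory collection.
answered, elapsed = probe(
    always_times_out,
    "https://10.200.12.55/redfish/v1/UpdateService/FirmwareInventory",
    timeout=3.0,
)
print(answered)
```

The fetch callable has to raise the built-in TimeoutError. The requests library raises its own requests.exceptions.ReadTimeout instead, so a wrapper around requests would need to translate that exception.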


Andreas, I’ve been on this forum for almost 10 years now (April 2015).
You’ve been a fucking angel on the shoulder.

So… Lenovo XCC.

  1. No need to use the Lenovo XClarity special agent anymore.
  2. JUST Redfish with a read-only user.

And, YES: I disabled “Firmware versions” and it fucking worked.

I’m seeing literally almost 70 services.

Fabulous.

Have a good one!


@andreas-doehler, I have one more question about this.

You said:
“don’t need the redfish Python library anymore inside your ~/local/lib/python3/ folder”

Can you elaborate?

  1. I don’t need to update the redfish API MKP using the latest from Checkmk Exchange

Or…

  2. I don’t need to do this: pip3 install 'urllib3<2' redfish

If the answer is 2…
my second question is:

Should I keep the redfish API MKP updated, using the one from Checkmk Exchange? Or is there no need, and I just wait for the next CMK 2.3 release?

Thanks!!

The pip3 install is not needed anymore.

I would recommend updating to the latest MKP. You can find it on the Checkmk Exchange or on my GitHub page.

Fabulous.

Last one, I promise.

I know there can be obvious incompatibilities, but theoretically, can I use the redfish API MKP on CEE 2.2? For example, the latest version: 2.3.60.

Or should I use it exclusively on CEE 2.3?

For 2.2 you can use the MKP with version number 2.2.60 - normally you will find the same MKP release once with 2.3 and once with 2.2 in its name.
But keep in mind that on 2.2 the Python redfish library needs to be installed.

But… will you still maintain it?
I thought you were focusing only on 2.3.

It’s the same then?
In terms of monitoring stuff.

At the moment I try to keep both versions in sync. It can happen that a new release for 2.2 takes a few days longer than for 2.3. But that’s all.


@andreas-doehler … hello there.
2 things, when you have some time.

  1. On Dell iDRACs, I could see the server power consumption in 1 service (SNMP). Example: 500 W. On XCC I’m seeing 2 services, one for each PSU (PSU 1: 250 W, PSU 2: 250 W). Do you plan to put the total power consumption into 1 service? It would be fantastic.

  2. I’m seeing these 2 WARNINGs on every server:

Any hint on that?

Thank you!!!
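
For reference, the single value asked for in point 1 is just the sum of the per-PSU readings (250 W + 250 W = 500 W). A minimal sketch, assuming the readings are available as a mapping of PSU name to watts; total_power_watts is a hypothetical helper and not how the redfish MKP stores its data:

```python
def total_power_watts(psu_watts: dict[str, float]) -> float:
    """Sum the per-PSU power readings into one server-wide value."""
    return sum(psu_watts.values())


# Readings taken from the example in the post above.
print(total_power_watts({"PSU 1": 250.0, "PSU 2": 250.0}))
```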

For number 2 - this looks like a RAID controller with a defective cache battery.

Hi there!!
These are all brand-new Lenovo servers, so it’s odd.

There was indeed only 1 warning on XCC, about “threshold exceeded”, referring to the number of boots.

OK, I reset the BMC on each one. System state is now green.

The one about RAID is still in warning. Can I help you somehow? By giving you the output of “cmk -d”?

Thanks!

From the output only the section with the raid controller would be relevant.
If you search inside the output for “RAID_Slot16” then you should find the correct section.
Or inside the extended info of this check you will find some more information.

Sorry for the delay.
So, maybe this can help.

<<<redfish_storage:sep(0)>>>
{"@odata.context": "/redfish/v1/$metadata#Storage.Storage", "@odata.etag": "\"f38d9b6f5da02ff5c8984\"", "@odata.id": "/redfish/v1/Systems/1/Storage/RAID_Slot16", "@odata.type": "#Storage.v1_13_0.Storage", "Actions": {"Oem": {"#LenovoStorage.PrefetchSMARTData": {"target": "/redfish/v1/Systems/1/Storage/RAID_Slot16/Actions/Oem/LenovoStorage.PrefetchSMARTData", "title": "PrefetchSMARTData"}}}, "Description": "This resource is used to represent a storage for a Redfish implementation.", "Drives": [{"@odata.id": "/redfish/v1/Systems/1/Storage/RAID_Slot16/Drives/Disk.M.2_Bay_0"}, {"@odata.id": "/redfish/v1/Systems/1/Storage/RAID_Slot16/Drives/Disk.M.2_Bay_1"}], "Drives@odata.count": 2, "EncryptionMode": "Disabled", "Id": "RAID_Slot16", "Links": {"Enclosures": [{"@odata.id": "/redfish/v1/Chassis/1"}], "Enclosures@odata.count": 1}, "Name": "RAID Storage", "Status": {"Health": "OK", "HealthRollup": "OK", "State": "Enabled"}, "StorageControllers": [{"@odata.id": "/redfish/v1/Systems/1/Storage/RAID_Slot16#/StorageControllers/0", "AssetTag": "", "CacheSummary": {"PersistentCacheSizeMiB": 0, "Status": {"State": "Disabled"}, "TotalCacheSizeMiB": 0}, "FirmwareVersion": "52.27.0-5389", "Identifiers": [{"DurableName": "26B6006F-4801-44EB-AD19-745D22DEAADB", "DurableNameFormat": "UUID"}], "Location": {"Info": "Slot 16", "PartLocation": {"LocationOrdinalValue": 16, "LocationType": "Slot", "ServiceLabel": "PCI 16"}}, "Manufacturer": "Lenovo", "MemberId": "0", "Model": "SAS3808N", "Name": "ThinkSystem M.2 RAID B540i-2i SATA/NVMe Enablement Kit", "Oem": {"Lenovo": {"@odata.type": "#LenovoStorage.v1_0_0.LenovoStorageController", "FirmwareDeviceOrderEnabled": false, "MaxStripeSizeBytes": 65536, "MinStripeSizeBytes": 65536, "Mode": "RAID/JBOD", "SupportedRaidLevels": "0/1/10/00", "SupportedRaidLevels@Redfish.Deprecated": "The property is deprecated. Please use SupportedRAIDTypes instead."}}, "PartNumber": "SR17B71469", "SKU": "03LD926", "SerialNumber": "L2HF43M00H6", "Status": {"Health": "OK", "State": "Disabled"}, "SupportedControllerProtocols": ["PCIe"], "SupportedDeviceProtocols": ["SATA", "SAS", "NVMe"], "SupportedRAIDTypes": ["RAID0", "RAID1", "RAID10", "RAID00"]}], "StorageControllers@odata.count": 1, "StoragePools": {"@odata.id": "/redfish/v1/Systems/1/Storage/RAID_Slot16/StoragePools"}, "Volumes": {"@odata.id": "/redfish/v1/Systems/1/Storage/RAID_Slot16/Volumes"}}
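
In this section the storage resource itself reports Health OK and State Enabled, while the controller entry under StorageControllers reports Health OK and State Disabled. A minimal sketch for pulling exactly those fields out of such a JSON line; controller_states is a hypothetical helper, the sample is trimmed from the output above, and this is not the check plugin’s own parsing code:

```python
import json


def controller_states(section_json: str) -> list[tuple[str, str, str]]:
    """Return (name, health, state) for each entry in StorageControllers."""
    data = json.loads(section_json)
    result = []
    for ctrl in data.get("StorageControllers", []):
        status = ctrl.get("Status", {})
        result.append(
            (ctrl.get("Name", "?"), status.get("Health", "?"), status.get("State", "?"))
        )
    return result


# Trimmed copy of the posted section, keeping only the fields used here.
SAMPLE = json.dumps({
    "Id": "RAID_Slot16",
    "Status": {"Health": "OK", "HealthRollup": "OK", "State": "Enabled"},
    "StorageControllers": [{
        "Name": "ThinkSystem M.2 RAID B540i-2i SATA/NVMe Enablement Kit",
        "Status": {"Health": "OK", "State": "Disabled"},
    }],
})

for name, health, state in controller_states(SAMPLE):
    print(f"{name}: Health={health}, State={state}")
```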

Another model.

Same issue.

I found this… phrase.

It is a known quirk in many Redfish implementations, especially with Dell iDRACs and some Lenovo XClarity systems, that Redfish reports a storage resource as “Disabled” even though it is actually functioning normally.

If it is reported as “Disabled”, there is nothing you can do from the check side.
What does the web interface of such a device say about this controller?
It normally uses the same data as you get.