Hello,
I am using the official mk_inventory.linux plugin from my local check_mk server.
I noticed that for the host machine this plugin shows only one GPU of type X, but I know that it has 8 identical cards of that type.
This is the part for the GPU:
section_lnx_video() {
    # Collect VGAs if they are present
    vgas="$(lspci | grep VGA | cut -d" " -f 1)"
    [ -n "$vgas" ] || return
    echo "<<<lnx_video:sep(58)>>>"
    printf '%s\n' "$vgas" | while IFS= read -r vga; do
        lspci -v -s "$vga"
    done
}
On the host machine this code outputs all 8 cards without a problem. I believe that, since the cards are all of the same type, the view in check_mk collapses them into one row.
Can I somehow add an ID for each card here so that check_mk does not collapse the view?
This string is split at “:” and the second-to-last field is extracted as a key for the inventory dictionary. In the example this is “Microsoft Corporation Hyper-V virtual VGA (prog-if 00 [VGA controller])”.
So if you have 8 VGA cards that have the same string in this field, you'll only get the values of the last one in your inventory.
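The collapsing effect can be reproduced outside of Checkmk. The following is only a minimal sketch with made-up sample data: the "slot|device name" format, the device name and the helper count_entries are hypothetical and are not what the Checkmk parser uses. It only shows that a map keyed by the device name keeps one entry for identical cards, while a key that also contains the slot keeps one entry per card.

```shell
# Hypothetical sample data: "slot|device name" for two identical cards.
sample='01:00.0|Example Corp Example GPU
02:00.0|Example Corp Example GPU'

# Store every row in an awk map and print how many entries survive.
# mode "name": key is the device name only (identical cards overwrite each other).
# mode "slot": key is the PCI slot plus the device name (every card stays unique).
count_entries() {
    awk -F'|' -v mode="$1" '
        { key = (mode == "slot") ? $1 " " $2 : $2; inventory[key] = $1 }
        END { n = 0; for (k in inventory) n++; print n }
    '
}

by_name=$(printf '%s\n' "$sample" | count_entries name)
by_slot=$(printf '%s\n' "$sample" | count_entries slot)

echo "entries when keyed by name only: $by_name"
echo "entries when keyed by slot and name: $by_slot"
```

With 8 identical cards the first count would still be one, which matches the single row in the inventory view.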
Because 2.3 will be out of active maintenance in a month, I experimented with 2.4, which has the same issue.
We will see what happens with my PR #861, which adds the slot ID and should fix this issue.
Maybe you could add your lnx_video section output to the PR to have more test data.
Thanks to this (PCI) ID at the end I now see all 8 cards in inventory view.
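The changed plugin code itself is not shown in this thread, so the following is only a guess at what such a change could look like, not the content of the PR. The helper append_id_to_first_line and the function name section_lnx_video_with_id are hypothetical. The sketch appends the PCI slot to the first line of each lspci -v block so that identical cards no longer produce identical text. Note that the slot contains ":" characters and the section is split at ":", so the result should be checked against the parser of the Checkmk version in use.

```shell
# Append an identifier to the first line read from stdin and pass
# all following lines through unchanged. (Hypothetical helper.)
append_id_to_first_line() {
    awk -v id="$1" 'NR == 1 { print $0 " " id; next } { print }'
}

# Variant of section_lnx_video that tags each card with its PCI slot.
# Assumption: an ID at the end of the first line is enough to tell the cards apart.
section_lnx_video_with_id() {
    # Collect VGAs if they are present
    vgas="$(lspci | grep VGA | cut -d" " -f 1)"
    [ -n "$vgas" ] || return
    echo "<<<lnx_video:sep(58)>>>"
    printf '%s\n' "$vgas" | while IFS= read -r vga; do
        lspci -v -s "$vga" | append_id_to_first_line "$vga"
    done
}
```

The helper can be tried without any GPU, for example with printf 'first\nsecond\n' | append_id_to_first_line 01:00.0, before the changed file is distributed to other hosts.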
Now I am not sure if I should wait for the official patch or just send this changed file to every host that we have. How long would we have to wait for the official patch?
Considering that 2.3 will be out of active maintenance in a month, I will also update to 2.4 in the meantime.
Also, do you know whether I have to reinstall the agents on the monitored hosts after changing the major version? I cannot find any information about it, and I have not experienced any bugs with the Version: 2.3.0p30 agent on the monitored hosts.
@dandon223 I would recommend updating the agents after Checkmk upgrades. But you would need to redo your modification for the inventory until there is an upstream change for it.
@Sara or @martin.hirschvogel Who is the person to contact when the PR CI tests on GitHub fail? Some of them seem to be flaky, and it would be great to be able to retrigger just an individual test if necessary. For the formatting test, though, I'm not sure whether the test itself has a problem. In my repo it passes, but it cannot download one artifact:
WARNING: Download from https://artifacts.lan.tribe29.com/repository/upstream-archives/github.com/llvm/llvm-project/releases/download/llvmorg-19.1.7/LLVM-19.1.7-Linux-X64.tar.xz failed: class java.io.IOException Connect timed out
Meanwhile, in the checkmk repo the download seems to work, but the test fails because it runs out of disk space:
ERROR: /home/runner/.cache/bazel/_bazel_runner/aed53f964069daa6ab471b6b9883c077/external/bazel_tools/tools/build_defs/repo/http.bzl:139:45: An error occurred during the fetch of repository 'llvm_linux_x86_64+':
Traceback (most recent call last):
File "/home/runner/.cache/bazel/_bazel_runner/aed53f964069daa6ab471b6b9883c077/external/bazel_tools/tools/build_defs/repo/http.bzl", line 139, column 45, in _http_archive_impl
download_info = ctx.download_and_extract(
Error in download_and_extract: java.io.IOException: Error extracting /home/runner/.cache/bazel/_bazel_runner/aed53f964069daa6ab471b6b9883c077/external/llvm_linux_x86_64+/temp4487391751063356141/LLVM-19.1.7-Linux-X64.tar.xz to /home/runner/.cache/bazel/_bazel_runner/aed53f964069daa6ab471b6b9883c077/external/llvm_linux_x86_64+/temp4487391751063356141: write (No space left on device)
ERROR: no such package '@@llvm_linux_x86_64+//': java.io.IOException: Error extracting /home/runner/.cache/bazel/_bazel_runner/aed53f964069daa6ab471b6b9883c077/external/llvm_linux_x86_64+/temp4487391751063356141/LLVM-19.1.7-Linux-X64.tar.xz to /home/runner/.cache/bazel/_bazel_runner/aed53f964069daa6ab471b6b9883c077/external/llvm_linux_x86_64+/temp4487391751063356141: write (No space left on device)
ERROR: /home/runner/work/checkmk/checkmk/bazel/tools/format/BUILD:56:16: //bazel/tools/format:format_C++_with_clang-format.check depends on @@llvm_linux_x86_64+//:bin/clang-format in repository @@llvm_linux_x86_64+ which failed to fetch. no such package '@@llvm_linux_x86_64+//': java.io.IOException: Error extracting /home/runner/.cache/bazel/_bazel_runner/aed53f964069daa6ab471b6b9883c077/external/llvm_linux_x86_64+/temp4487391751063356141/LLVM-19.1.7-Linux-X64.tar.xz to /home/runner/.cache/bazel/_bazel_runner/aed53f964069daa6ab471b6b9883c077/external/llvm_linux_x86_64+/temp4487391751063356141: write (No space left on device)
Yes, my workaround for the tests running in GitHub Actions was merged last week. Now that the tests have a chance to finish successfully, I have updated all of my other PRs. The PR for this specific issue now has a “tracked” label, and I am waiting for feedback.