Hi, I wanted to contribute by creating a merge request, but I do not have Nexus credentials, so I am writing this post instead.
Checkmk version: Checkmk Raw Edition 2.4.0p13
We are using Checkmk to get a real-time view of how many NVIDIA GPUs we have across our Linux servers.
After installing the official mk_inventory.linux, I noticed that the GPUs of one server were missing.
After more debugging, I saw that on the other servers the output of the lspci command looks like this:
01:00.0 VGA compatible controller: NVIDIA Corporation GA102GL [RTX A5000] (rev a1)
but on the server whose GPUs do not show up, it looks like this:
0000:2a:00.0 3D controller: NVIDIA Corporation Device 2bb3 (rev a1)
I know that this function parses this output:
def parse_lnx_video(string_table: StringTable) -> Section:
    parsed_section: dict[str, GraphicsCard] = {}
    current_name: str = ""
    for line in string_table:
        if len(line) <= 1:
            continue
        if "VGA compatible controller" in line[-2]:
            current_name = line[-1].strip()
            if current_name:
                parsed_section.setdefault(current_name, GraphicsCard(name=current_name))
        elif current_name:
            if line[0] == "Subsystem":
                parsed_section[current_name].subsystem = line[1].strip()
            elif line[0] == "Kernel driver in use":
                parsed_section[current_name].driver = line[1].strip()
    return parsed_section
As we can see, the first line of each device has to contain "VGA compatible controller". Our GPUs that are reported as "3D controller" were installed normally and are used in the same way as the others, so I am not sure why this is the case.
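To illustrate what I mean, here is a minimal sketch of how the parser could accept both device classes. This is not the upstream Checkmk code: `GraphicsCard` and `StringTable` are simplified stand-ins, and `GPU_CLASSES` and `parse_lnx_video_sketch` are hypothetical names I made up for this example. It also assumes the agent plugin would emit the "3D controller" lines unmodified.

```python
from dataclasses import dataclass

# Stand-in for the Checkmk type (assumption, not the real definition).
StringTable = list[list[str]]


@dataclass
class GraphicsCard:
    # Simplified stand-in; the real class may have other fields.
    name: str
    subsystem: str = ""
    driver: str = ""


# Hypothetical: lspci device classes that should count as a graphics card.
GPU_CLASSES = ("VGA compatible controller", "3D controller")


def parse_lnx_video_sketch(string_table: StringTable) -> dict[str, GraphicsCard]:
    parsed: dict[str, GraphicsCard] = {}
    current_name = ""
    for line in string_table:
        if len(line) <= 1:
            continue
        # Same logic as the original, but matching any of the GPU classes.
        if any(cls in line[-2] for cls in GPU_CLASSES):
            current_name = line[-1].strip()
            if current_name:
                parsed.setdefault(current_name, GraphicsCard(name=current_name))
        elif current_name:
            if line[0] == "Subsystem":
                parsed[current_name].subsystem = line[1].strip()
            elif line[0] == "Kernel driver in use":
                parsed[current_name].driver = line[1].strip()
    return parsed


# The two lspci lines from this post, split on ":" as sep(58) would do.
raw = [
    "01:00.0 VGA compatible controller: NVIDIA Corporation GA102GL [RTX A5000] (rev a1)",
    "0000:2a:00.0 3D controller: NVIDIA Corporation Device 2bb3 (rev a1)",
]
cards = parse_lnx_video_sketch([r.split(":") for r in raw])
print(sorted(cards))
```

With a check like this, both lines above would be picked up. The extra `0000:` PCI domain prefix does not get in the way, because the parser only looks at the last two fields of each line.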
Since I do not have Nexus credentials and wanted to fix it in my local environment, I updated mk_inventory.linux. This is the relevant part (in our use case we are only interested in NVIDIA):
section_lnx_video() {
    vgas=$(lspci | grep -i NVIDIA | grep -v -i AUDIO | cut -d' ' -f1)
    [ -n "$vgas" ] || return 0
    echo "<<<lnx_video:sep(58)>>>"
    echo "$vgas" | while IFS= read -r vga; do
        # Clean PCI IDs
        pci_start=$(echo "$vga" | sed 's/^0000://') # for start of line
        pci_end=$(echo "$vga" | sed 's/^0000://; s/:/./g') # for end of line
        # Read lspci output line by line
        first_line_done=0
        lspci -v -s "$vga" | while IFS= read -r line; do
            if [ "$first_line_done" -eq 0 ]; then
                # Replace PCI ID at start, append cleaned PCI at end
                # Replace '3D controller' with 'VGA compatible controller (3D controller)'
                first_line=$(echo "$line" | sed "s/^$vga/$pci_start/")
                first_line=$(echo "$first_line" | sed 's/3D controller/VGA compatible controller (3D controller)/')
                echo "$first_line $pci_end"
                first_line_done=1
            else
                echo "$line"
            fi
        done
    done
}
Because of this change, we now see those GPUs in the inventory view.
Could someone take a look and tell me what else I can do? How are those Nexus credentials created, or how else can I contribute this change to the community?