Last week we looked at a Linux VM with 32 GB of RAM. Memory usage - according to Check-MK's “RAM + Swap overview” - was ridiculously low, about 0.9 GB of 32 GB. We reduced the RAM to 8 GB. Since then, users have been complaining about lousy performance.
It turned out that this VM runs an application that makes heavy use of mmap(2) to memory-map database files totalling 90 GB. Of course performance dropped, because the VM was now paging a lot more.
I can see the memory mapping using top: the database process has a VIRT allocation of 97.1g.
My question: is there any way Check-MK can stop me from falling into that trap again? How can I see how much memory is mapped to files and how active that memory is?
I think mmap()ed data is recorded in the “Mapped data” metric of that check, which has not found its way into a graph. I do not know the reason for that decision.
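Independent of the graph, you can take a quick look at the raw numbers on the host itself. The following is only a minimal sketch: it assumes that the “Mapped data” metric corresponds to the Mapped field of /proc/meminfo (I have not verified that against the check's source), and the function names are made up for illustration. Active(file) and Inactive(file) give a rough idea of how much file-backed memory has been touched recently.

```python
import os

# Fields of /proc/meminfo that relate to file-backed memory.
FIELDS = ("MemTotal", "Cached", "Mapped", "Active(file)", "Inactive(file)")


def parse_meminfo(text):
    """Return {field: number} for each 'Name:   12345 kB' line."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def file_memory_summary(meminfo):
    """Pick the file-related fields that are present (hypothetical helper)."""
    return {field: meminfo[field] for field in FIELDS if field in meminfo}


if __name__ == "__main__":
    path = "/proc/meminfo"
    if os.path.exists(path):
        with open(path) as fh:
            summary = file_memory_summary(parse_meminfo(fh.read()))
        for field, kb in summary.items():
            # Values in /proc/meminfo are given in kB.
            print(f"{field:16s} {kb / 1024 / 1024:8.2f} GiB")
```

Keep in mind that Mapped only counts pages that are currently resident and mapped into a process, so on the 8 GB VM it can never show the full 90 GB of database files. It tells you how much of the RAM is in use for mappings, not how much the application would like to have.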
You can enable that graph by adding a new file $OMD_ROOT/local/share/check_mk/web/plugins/metrics/mem_mapped.py with this line as content: