Need help with customizing data retention

Hello there,

If you are willing to help me, I have a few questions regarding data retention with pnp4nagios:

  1. What is the approximate size of a single data point?

I’ve done a rough calculation based on the log size divided by the number of check results stored, and it seems to be around 100 bytes. Could you confirm that?

  2. Based on my approximate result and my needs, I’ve calculated the approximate disk space required with the following formula:

Size_Needed = (Nb_Service * Nb_Host * Data_Size * (Data_Retention_Time / Data_Interval)) / Size_Unit

Nb_Service = average number of services monitored per host
Nb_Host = number of hosts monitored
Data_Size = size of a single data point (one result of one service check), in bytes
Data_Retention_Time = how long data is retained, in seconds
Data_Interval = check interval, in seconds
Size_Unit = 10^9 (divisor to convert bytes to gigabytes)

This gave me 313 GB of disk space needed for one year of retention with a check interval of 60 s on my network.
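
The formula above can be sketched in Python as follows. This is only an illustration: the function name is made up, and the host and service counts in the example are hypothetical placeholders, not the values behind the 313 GB figure.

```python
# Minimal sketch of the Size_Needed formula from this post.
# All example inputs below are hypothetical, chosen only for illustration.

ONE_YEAR_S = 365 * 24 * 3600  # retention time of one year, in seconds


def retention_size_gb(nb_service, nb_host, data_size,
                      retention_s, interval_s, size_unit=10**9):
    """Estimated disk space in GB (size_unit = 10^9 bytes)."""
    # Number of check results kept per service over the retention period
    samples = retention_s / interval_s
    return nb_service * nb_host * data_size * samples / size_unit


if __name__ == "__main__":
    # Hypothetical network: 100 hosts, 20 services each, 100 bytes per result,
    # one year of retention, 60 s check interval
    estimate = retention_size_gb(20, 100, 100, ONE_YEAR_S, 60)
    print(f"{estimate:.1f} GB")  # prints "105.1 GB"
```

Note that this assumes every check result costs a fixed number of bytes; it does not account for how the graphing backend actually stores the data.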

So here are my two questions on this point:

  1. Could you confirm my maths?
  2. Do you have any experience or knowledge concerning the impact on the performance and responsiveness of the monitoring system? (Maybe some best practices to follow, like not going above 50 GB of retained data…)

PS: I’m running 15K RPM disks.

Kind regards,

Jonathan

Your questions are answered in this manual article: https://checkmk.com/cms_graphing.html


You forgot to say how many services you have and what types of services they are.
In addition to what @r.sander said, it is important to know how many performance values your services return on average across your systems.

I only used the fact that you want 1 year at 1-minute resolution. This gives a minimum of ~11 MB of data per performance value; for comparison, the default settings need only 400 kB per performance value. Your storage would need nearly 30 times the space of the default settings.
One interface check has 11 values → 122 MB per interface → a switch with 48 ports → ~5.5 GB per switch
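
The chain of multiplications above can be sketched in Python like this. It uses only the rough figures quoted in this reply (~11 MB and 400 kB per performance value, 11 values per interface, 48 ports); the constant and function names are made up for illustration. The exact products land in the same range as the rounded numbers above.

```python
# Rough RRD sizing using the approximate figures quoted in this reply.
# Names are illustrative only; the sizes are estimates, not measurements.

MB_PER_VALUE = 11          # ~11 MB per performance value (1 year at 1 min)
DEFAULT_KB_PER_VALUE = 400  # ~400 kB per performance value with default settings
VALUES_PER_INTERFACE = 11   # performance values returned by one interface check
PORTS_PER_SWITCH = 48


def rrd_size_mb(value_count, mb_per_value=MB_PER_VALUE):
    """Approximate RRD space in MB for a number of performance values."""
    return value_count * mb_per_value


if __name__ == "__main__":
    per_interface = rrd_size_mb(VALUES_PER_INTERFACE)
    per_switch = rrd_size_mb(VALUES_PER_INTERFACE * PORTS_PER_SWITCH)
    ratio = MB_PER_VALUE * 1000 / DEFAULT_KB_PER_VALUE

    print(f"per interface: {per_interface} MB")          # 121 MB
    print(f"per switch:    {per_switch / 1000:.1f} GB")  # 5.8 GB
    print(f"vs. default:   {ratio:.1f}x")                # 27.5x
```

Multiplying the per-switch result by the number of switches, and adding the other hosts and services the same way, gives a first estimate of the total RRD volume.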

Thanks! This answered almost all of my questions.

Hello Andreas,

Thanks again for your help!

Now that I’ve read the article mentioned by R. Sander, I understand your calculation.

I also understood that with a single phase (1 year at 1-minute resolution), the disk I/O would be reduced. Nevertheless, I’m wondering whether there would be an impact on the monitoring system’s performance (displaying graphs, etc.) if, let’s say, we had 500 GB of RRDs?

Looking forward to your reply.

With SSD storage only, it is just a size problem :smiley:

Alright.

Your help is much appreciated again.
