Check_mk: service is stale, no data has been received within the last 1.5 check periods

Hi Team,

We recently upgraded our check_mk from 1.4.0P8 to 1.6.0P11.
Everything is running well, but I see a strange behaviour: all services are
marked as “This service is stale, no data has been received within the
last 1.5 check periods”, even though the perfdata values and graphs are fine!

Please help me to resolve this issue.

Regards,
Krishna

Is this on all your services?
What version do you use - Raw or Enterprise?
If it is the Raw edition and your system is bigger, it can take up to 30 minutes after a core restart until all stale services are gone.
Enterprise should fix this after 2-5 minutes.

@andreas-doehler Yes, we are using Raw edition 1.6.0P11 and we are monitoring 1900+ hosts. If we enable distributed monitoring, will it help us sort out this issue?

The problem is the old Nagios core. If this core runs without restarts, most things are fine, but if you restart it often or it is down for a longer time, the complete schedule becomes invalid and the core needs to reschedule all checks. The default time horizon for this is, if I remember correctly, around 30 minutes.
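To make the "1.5 check periods" message concrete, here is a minimal sketch of the arithmetic behind it. The function names and the `>=` comparison are my own assumptions for illustration, not the core's actual code; only the factor 1.5 comes from the message text.

```python
# Sketch only: how "stale" can be read as "age of last result >= 1.5 check periods".
STALE_THRESHOLD = 1.5  # taken from the message text: "1.5 check periods"


def staleness(age_s: float, check_interval_s: float) -> float:
    """Age of the last check result, expressed in check periods."""
    return age_s / check_interval_s


def is_stale(age_s: float, check_interval_s: float,
             threshold: float = STALE_THRESHOLD) -> bool:
    """True if the last result is older than `threshold` check periods."""
    return staleness(age_s, check_interval_s) >= threshold


if __name__ == "__main__":
    interval = 60  # hypothetical 1-minute check interval
    for age in (60, 90, 30 * 60):
        print(age, staleness(age, interval), is_stale(age, interval))
```

With a 1-minute interval, a service that is only rescheduled somewhere inside a 30-minute horizon after a restart goes far beyond 1.5 periods, which is why everything looks stale for a while even though the last perfdata values are still shown.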

Distributed monitoring can also help you in this case, as you get smaller instances. These smaller instances restart quicker, config generation takes less time, and so on.

But this is not directly related to the stale problem.

Is there any way to upgrade the Nagios core in Raw edition 1.6.0P11?

Short answer - No

If you monitor around 2k hosts, I would split the system into around 4 or 5 instances, each with 400-500 hosts. This should perform much better, even on one big hardware system underneath.
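As a rough illustration of the sizing, here is a small sketch that distributes a host list over a number of sites. The round-robin assignment and the host names are purely hypothetical; in a real setup you would assign hosts to sites in the Checkmk configuration, probably grouped by location or function.

```python
# Sketch only: spread hosts evenly over n sites to see the resulting site sizes.
from typing import List


def split_hosts(hosts: List[str], n_sites: int) -> List[List[str]]:
    """Assign hosts to n_sites buckets in round-robin order."""
    if n_sites < 1:
        raise ValueError("n_sites must be at least 1")
    sites: List[List[str]] = [[] for _ in range(n_sites)]
    for i, host in enumerate(hosts):
        sites[i % n_sites].append(host)
    return sites


if __name__ == "__main__":
    # Hypothetical host names; 1900 matches the host count mentioned in this thread.
    hosts = [f"host{i:04d}" for i in range(1900)]
    for n in (4, 5):
        print(n, [len(site) for site in split_hosts(hosts, n)])
```

For 1900 hosts, 4 sites give 475 hosts each, which sits inside the 400-500 range; 5 sites give 380 each.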

I have 2 or 3 such systems running with multiple big sites on one piece of hardware, and that works better than one big system.

@andreas-doehler Thank you so much for your answers, they are really very helpful :)

How do you see all hosts in one site?
Is one site the master and the other 4 the slaves?

Can you maybe say what the hardware of that system looks like?

The master is a virtual machine. It checks nothing and only does the configuration and presentation.
Do you mean the hardware of the real monitoring host or of the master VM?

You said we can have e.g. 500 hosts on each site, so here we have 4 “slave” sites on one piece of hardware. What are its specifications?

Then you have 1 “master” site which has the above 4 sites as slaves, right?

Or what do you mean by that?

Thanks again
