[Check_mk (english)] OMD distributed monitoring

Hi there,

We are currently using OMD in a distributed monitoring topology. It works fine until one of the slave core installations becomes unreachable over the network; then the web UI of the master site becomes unreachable too.

The master OMD site tries to connect to the slave OMD site without any timeout, and during that time Check_mk on the master is unusable.

Is there an easy way to fix this bug? Thanks :slight_smile: We are using OMD v1.10 with check_mk 1.2.2p2.

Regards

Z. D.

We have had the same issue for years. We resolved it by writing a small script that sets the remote site to disabled until the link comes back up. Since we put that in place some time ago, we have had no more issues, and Check_MK is nice and fast for all the other working sites.
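
For illustration only, a minimal sketch of what such a watchdog could look like (this is not the actual script). It assumes the slave's Livestatus is exposed over TCP (port 6557 is a common choice, but check your setup). The `set_disabled` callback is hypothetical: how you actually flip a site to disabled depends on your Check_MK version and on how your site connections are stored.

```python
import socket
from typing import Callable


def is_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable network and timeouts.
        return False


def sync_site_state(
    site_id: str,
    host: str,
    port: int,
    set_disabled: Callable[[str, bool], None],
    timeout: float = 2.0,
) -> bool:
    """Disable the site while its link is down, re-enable it once it is back.

    `set_disabled(site_id, disabled)` is a hypothetical hook that you would
    have to implement for your own installation.
    Returns True if the site was reachable.
    """
    up = is_reachable(host, port, timeout)
    set_disabled(site_id, not up)
    return up
```

Run from cron every minute or so, something like this keeps a dead slave from stalling the GUI, at the cost of a short delay before the site is switched off or back on.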

Cheers
Alex

···

On 15/04/2015 5:52 PM, “Zdeněk Dlouhý” zdenek.dlouhy@casablanca.cz wrote:


checkmk-en mailing list

checkmk-en@lists.mathias-kettner.de

http://lists.mathias-kettner.de/mailman/listinfo/checkmk-en

We’ll meet in Munich for the 2nd Check_MK Conference!

Book your place now and be part of it.

October 18th-20th, 2015

http://mathias-kettner.de/conference

That's not a bug :slight_smile:
You need to define a status host on your central site for every connected slave.
This status host must then be entered in the distributed monitoring configuration of each slave.

If the status host is not reachable, the central web UI does not try to reach the slave.
With such a configuration you avoid the timeout problem.
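
As a rough sketch, this is how a site connection with a status host may look if you maintain the Multisite configuration by hand (`multisite.mk` is a Python-syntax file). The key names are as I remember them from the 1.2.x format, so please verify them against your version. The site IDs, the address and the host name `slave1-gw` are made-up examples. With WATO the same setting is made in the distributed monitoring dialog.

```python
# Sketch of a Multisite "sites" definition; all IDs and addresses are examples.
sites = {
    # The central site itself (local connection).
    "master": {
        "alias": "Central site",
    },
    # A remote slave, reached via Livestatus over TCP.
    "slave1": {
        "alias": "Slave 1",
        "socket": "tcp:192.0.2.10:6557",
        # (site that monitors the status host, name of the status host).
        # If this host is not UP on "master", the GUI skips "slave1".
        "status_host": ("master", "slave1-gw"),
    },
}
```

The status host should be something the central site monitors itself and that goes down together with the link, for example the slave server or its gateway.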

Best regards

Andreas
