DRBD: a pain or a blessing?

Hi Forum,

Our cluster fails over occasionally, and we are eager to find the root cause.
What is your experience: which triggers can cause a failover, and how do you detect them in the logs?
(I know the advice is not to cluster the virtual appliance, but that information was NOT available when we installed it.)

Hi Philippe,

I would remove the clustering and rely on the HA features of the hypervisor:

https://checkmk.atlassian.net/wiki/spaces/KB/pages/9471820/Why+you+should+not+cluster+the+virtual+appliance

Possible causes are VM snapshots or VM backups, network interruptions, and disk I/O latency.
The appliance has a download option for the cluster log and the kernel log.
For a detailed analysis you will have to go through the DRBD, Pacemaker, and Corosync entries in those logs.
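
As a starting point for that analysis, here is a minimal Python sketch that groups the lines of a downloaded log by the component they mention. It assumes nothing about the actual log format: it only does a case-insensitive text search for the three names above, and the function name `group_by_component` is made up for this example.

```python
import re
from collections import defaultdict

# Component names taken from the reply above. This is a plain
# case-insensitive text search, not a parser for any real log format.
COMPONENTS = ("drbd", "pacemaker", "corosync")
PATTERN = re.compile("|".join(COMPONENTS), re.IGNORECASE)


def group_by_component(lines):
    """Return {component: [(line_number, text), ...]} for matching lines.

    `lines` can be any iterable of strings, for example an open file.
    A line that mentions several components is listed under each of them.
    """
    hits = defaultdict(list)
    for number, line in enumerate(lines, start=1):
        # Collect each component mentioned on this line once.
        mentioned = {m.group(0).lower() for m in PATTERN.finditer(line)}
        for component in mentioned:
            hits[component].append((number, line.rstrip("\n")))
    return dict(hits)
```

To use it, open the downloaded file (whatever it is called on your machine) and pass the file object to `group_by_component`. Comparing the line numbers and timestamps of the three groups around the time of a failover may show which component reacted first.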
