I need some help with the automatic startup of my machines. I have a UPS (SAI), 3 NAS units and several VMs running under XenServer.
My problem is that after a power outage (blackout), the UPS automatically turns on all my NAS units and my virtual machines. The NAS units take a while to boot (they run a hard drive check), and the VM disks are stored on the NAS, so if the NAS units aren't ready, the virtual machines fail to power on properly because their disks are missing. I would like to know how to use Checkmk to send Wake-on-LAN to the VMs once NFS on my NAS is operational (my Checkmk runs on a separate server that does not depend on the NAS, so no problem there).
I tried putting a Wake-on-LAN script on the NAS, but the NAS runs the script before its NFS service is ready, so that doesn't work. I also searched for a Nagios script, but found no information…
This matters to me because, if there is a blackout, I would like the restart of the entire network infrastructure to be automated. Any idea, help or tip would be welcome…
You might use alert handlers. One should be very careful with alert handlers: users have reportedly restarted their CMK server from alert handlers, which is not a good idea. Triggering a script that sends WoL packets until a host is pingable does not pose any risk.
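A minimal sketch of such a script, in case it helps. It builds the standard WoL magic packet (6 bytes of 0xFF followed by the target MAC repeated 16 times), broadcasts it over UDP, and repeats until the host answers a ping. The function names, the broadcast address, UDP port 9, the retry counts and the Linux `ping` flags are all assumptions for illustration; nothing here is a Checkmk API, and the target must actually honour Wake-on-LAN.

```python
import socket
import subprocess
import time


def build_magic_packet(mac: str) -> bytes:
    """Return the 102-byte magic packet: 6 x 0xFF, then the MAC 16 times."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError(f"not a valid MAC address: {mac!r}")
    return b"\xff" * 6 + bytes.fromhex(hex_digits) * 16


def send_wol(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Broadcast one magic packet (address and port are assumed defaults)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(build_magic_packet(mac), (broadcast, port))


def is_pingable(host: str) -> bool:
    """Send one ICMP echo with the system ping (Linux iputils flags assumed)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "2", host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def wake_until_up(mac, host, attempts=30, interval=10.0,
                  send=send_wol, is_up=is_pingable, sleep=time.sleep) -> bool:
    """Send WoL packets until the host responds or the attempts run out."""
    for _ in range(attempts):
        if is_up(host):
            return True
        send(mac)
        sleep(interval)
    return is_up(host)
```

An alert handler could then call something like `wake_until_up("00:11:22:33:44:55", "vm-host.example")`, with the MAC and hostname replaced by your own. The bounded number of attempts keeps the handler from looping forever if the machine never comes up.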
Checkmk can send email notifications on state changes of hosts and services.
My suggestion is to have an IFTTT service check your mail for the OK-state mail from your NAS and use that to trigger a WoL script elsewhere. I have no actual knowledge of how to do it myself, just playing muse.
I work with VMware, and it has a VM startup delay that I can set for starting VMs after an outage. I haven't worked with Xen yet, but maybe it has the same feature.
In my opinion this is a job for a script in the USV management tool.
The better tools have an option for running scripts.
Another solution may be to use a small Linux admin system with a local disk etc. for running all the checks and tools needed for such jobs.
Ralf
Yep, this is the best solution. That's why I need an automatic Wake-on-LAN in Checkmk, if it's possible, because the day I buy a small local server I will put Checkmk etc. on it…
Do you have any link that explains the USV management tool? Or do you mean UPS? I ask because I'm from Spain and UPS in Spanish is “SAI”, haha.
Thanks for your reply!
That's what I thought, but… if I use a startup delay, the NAS can still take longer to start than the delay, which brings me back to the original problem… Anyway, that is my last resort if I cannot find a better solution.
a) The NAS starts after the blackout.
b) The host with the VMs starts without starting the VMs that use the NAS.
c) The admin VM starts.
d) The admin VM checks whether the NAS is online and whether the needed shares are available.
e) The admin VM starts the production VMs using SSH or other tools.
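Steps d) and e) could be sketched roughly as follows. This is only an illustration under several assumptions: NFS listens on TCP port 2049, `showmount` from the NFS client tools is installed on the admin VM (an NFSv4-only server may not answer it), the Xen host accepts SSH as root, and the production VMs are started with `xe vm-start uuid=...`. All function names, hostnames and UUIDs are hypothetical.

```python
import socket
import subprocess
import time


def nfs_port_open(host: str, port: int = 2049, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the NFS port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def share_exported(host: str, share: str) -> bool:
    """Return True if `showmount -e` lists the share (needs NFS client tools)."""
    try:
        result = subprocess.run(["showmount", "-e", host],
                                capture_output=True, text=True, timeout=15)
    except (OSError, subprocess.TimeoutExpired):
        return False
    if result.returncode != 0:
        return False
    # First output line is a header; each following line starts with a path.
    lines = result.stdout.splitlines()[1:]
    return share in [line.split()[0] for line in lines if line.strip()]


def wait_until(check, attempts=60, interval=10.0, sleep=time.sleep) -> bool:
    """Poll `check` until it returns True or the attempts run out."""
    for _ in range(attempts):
        if check():
            return True
        sleep(interval)
    return False


def xe_start_command(xen_host: str, vm_uuid: str) -> list:
    """Build the SSH command that starts one VM on the Xen host."""
    return ["ssh", f"root@{xen_host}", "xe", "vm-start", f"uuid={vm_uuid}"]


def start_vms_when_ready(nas_host, share, xen_host, vm_uuids,
                         run=subprocess.run) -> bool:
    """Wait for the NAS share, then start each production VM in order."""
    ready = wait_until(
        lambda: nfs_port_open(nas_host) and share_exported(nas_host, share))
    if not ready:
        return False
    for vm_uuid in vm_uuids:
        run(xe_start_command(xen_host, vm_uuid), check=False)
    return True
```

Run from the admin VM at boot, for example as `start_vms_when_ready("nas1.example", "/volume1/vms", "xen1.example", ["<vm-uuid>"])`. Polling the share instead of using a fixed delay avoids the problem of the NAS taking longer than expected.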
You could also create your own notification script which sends the WoL packets when your NFS file systems come online. If done cleverly, you could probably build this so that the affected hosts have a check themselves on which you can alert, and use that hostname as the target.
I have never done that, but it should be very much possible.
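For what it's worth, here is an untested sketch of how such a notification script might look. Checkmk passes the notification context to custom scripts as `NOTIFY_*` environment variables, and parameters from the notification rule arrive as `NOTIFY_PARAMETER_1`, `NOTIFY_PARAMETER_2` and so on. Everything else is an assumption: that the rule parameters hold the MAC addresses to wake, that the relevant service description contains "NFS", and that the `wakeonlan` command-line utility is installed on the Checkmk server.

```python
#!/usr/bin/env python3
# Wake hosts when an NFS service recovers
import os
import subprocess


def should_wake(env) -> bool:
    """True for a service notification reporting an NFS service as OK."""
    return (
        env.get("NOTIFY_WHAT") == "SERVICE"
        and env.get("NOTIFY_SERVICESTATE") == "OK"
        and "NFS" in env.get("NOTIFY_SERVICEDESC", "")  # assumed naming
    )


def target_macs(env) -> list:
    """Collect MAC addresses from the rule parameters (assumed usage)."""
    macs = []
    index = 1
    while f"NOTIFY_PARAMETER_{index}" in env:
        macs.append(env[f"NOTIFY_PARAMETER_{index}"])
        index += 1
    return macs


def main(env=os.environ, run=subprocess.run) -> int:
    """Send one WoL packet per configured MAC if the NFS service is OK."""
    if not should_wake(env):
        return 0
    for mac in target_macs(env):
        # Assumes the external `wakeonlan` tool is available on this server.
        run(["wakeonlan", mac], check=False)
    return 0
```

When installed as a real notification script, the file would end with `raise SystemExit(main())`, and the notification rule would need to be restricted to the NAS hosts and to recovery events so that it only fires when the NFS service comes back.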