Power surge kills UPS/RAID Array and more

Ever since we moved into the village of Eyke we’ve suffered with power cuts and power surges. It’s been that bad that we have a number of uninterruptible power supplies (UPS) dotted around the house to keep important things running when the power goes out.

Of late it’s been getting worse: not just the power cuts, but the power bouncing on and off very quickly for a period of 10-15 seconds when it comes back on. Unfortunately we had a particularly bad power bounce when the power was restored, and it killed the main UPS for the IT equipment rack and also took out the RAID storage array that I use for backups.

On top of this, the main server computer also took a hit and its solid state drives (SSDs) started to fail. This left me in a position where I had no backups to recover from and had to get all the data off the running virtual machines (VMs) before the SSDs failed.

My old server, which I decommissioned some months ago, was now my radio shack PC and so had a desktop operating system on it and lots of HAM radio software installed and configured, but I needed to press it back into service as a server again, very quickly!

So after backing up the desktop data I rebuilt the computer as a server again and began the tedious job of building new VMs and migrating the configuration and data over from the old VMs.

You’re probably wondering why I didn’t just transfer the VMs over whole to the replacement server?
To do this I’d need to shut them down to get a clean snapshot. However, when I tried it with a small, unimportant VM, it became corrupt during the shutdown process and could no longer be transferred to the replacement server.

Not wanting to take the risk with any of the other VMs, having lost all the backups, I decided to replicate all the VMs manually. Needless to say this isn’t a 5-minute job!

So, after a rather long week rebuilding everything, I now have all the services up and running on the replacement server, and the damaged server is ready to be stripped down to an empty case and rebuilt from scratch.

This has meant that at times my M0AWS Blog, The Matrix server and other online services have been offline for short periods, but sadly there was nothing I could do about it. Unfortunately the national grid/power companies take no responsibility for such events and say they only guarantee the frequency of the mains power (50Hz), not the voltage!

The last entry in the old UPS log was an over-voltage alert showing 1000V!

With a new UPS in place and online, we’ve already had a number of power cuts and it’s handled them well. Let’s hope we don’t get another big one!

Backups are now running again on external drives that are disconnected when not in use to protect them from power surges, and all the services have been successfully migrated over to the replacement server.

More soon ….