Tuesday 16 February 2010

Handling Power Loss - Cost vs Benefit

As part of my day-to-day life, I'm currently president of WiredSoc (the student computer science society in St. Andrews). It's an interesting and challenging position to hold.

One of the responsibilities I have (and delegate to some extent) is the reliable operation of WiredSoc's servers. Specifically, the complaints come to me or the systems administrator if something goes wrong.

This brings me to a recent outage we had. At approximately 1am on Monday morning, services disappeared due to a brief power interruption/surge (I've got conflicting reports on that one). 2 out of 3 servers dropped out and came back online by themselves. Our public-facing server came back online at about 9am, and only after I manually kicked it into action.

Now, an 8-hour outage on a voluntary service - is it acceptable? I would say yes. There is no way I'm going to force someone to go onsite at 2am to fix a problem when there is no compensation available. It's just beyond the bounds of acceptability! Really, would you volunteer to do it?
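
To put that 8 hours into perspective, here is a rough sketch of what it does to an availability figure. The function name and the choice of measurement periods (a 30-day month and a non-leap year) are my own assumptions for illustration, and it assumes this was the only outage in the period.

```python
HOURS_PER_MONTH = 30 * 24   # assumption: a 30-day month
HOURS_PER_YEAR = 365 * 24   # assumption: a non-leap year


def availability(downtime_hours: float, period_hours: float) -> float:
    """Return the fraction of the period during which the service was up."""
    if period_hours <= 0:
        raise ValueError("period_hours must be positive")
    if not 0 <= downtime_hours <= period_hours:
        raise ValueError("downtime_hours must be between 0 and period_hours")
    return 1 - downtime_hours / period_hours


outage_hours = 8  # roughly 1am to 9am

print(f"Over a month: {availability(outage_hours, HOURS_PER_MONTH):.2%}")
print(f"Over a year:  {availability(outage_hours, HOURS_PER_YEAR):.2%}")
```

Measured over a year, a single 8-hour outage still leaves the service up about 99.9% of the time, which is not bad going for something run by volunteers.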

Also, a UPS (uninterruptible power supply, basically a battery used to keep computers ticking over during a power loss) is out of the question - too costly and of little benefit. Realistically, power outages tend to take out our connectivity as well. So, we could remain powered up but be of no use to anyone!

This leads me onto a similar problem I also saw with STAR's systems when I was in charge of those. Would it have been worth installing a UPS if we had the budget? Nope - power outages again took out the network and server. So, even if the studio was still live, we could not get the broadcast out (we were an online station).

In conclusion, it is my opinion that voluntary projects should strive for high availability. However, it should also be taken into consideration that it is not the end of the world if your service disappears temporarily.
