How to Address Your Audience during a Hiccup
July 2nd, 2008 | Filed under: Business
This past year has been a busy one for hosted services, with several suffering major issues due to massive failures at data center facilities. Providers like The Planet have been affected, and most recently Autotask had an outage. That one interested me the most because I am an Autotask customer, so I wanted to take some time to comment on the e-mail that Adam Stewart sent out this afternoon:
This morning, some Autotask customers experienced an outage. We realize that your business absolutely relies on Autotask and that any outage can be potentially devastating. We have invested significantly in systems and processes to ensure the highest possible uptime. This morning, one of the partners on which we rely failed us and some of you lost service for a short time.
The co-location facility that houses our datacenter experienced a catastrophic power circuit failure early this morning. While the facility provides redundant power (generator with on site storage of 7 days worth of fuel), this failure disconnected the generator from the rest of the facility making it useless until power was manually rerouted. Because we planned for such an emergency, we have local UPS units at each of our racks. These immediately kicked in and sustained the load for 30 to 45 minutes. They then gracefully shut down certain servers to ensure no loss of data, just as they should have.
Power was quickly restored when the on-call tech arrived and we gradually brought up the affected servers again to ensure no data loss and a smooth restoration of functionality.
While this type of incident is rare, we plan to increase the amount of local UPS capacity in our racks, to provide additional redundancy directly within our control.
For the first half of 2008, the main Autotask service has had no outages other than scheduled maintenance where we work to increase our fault tolerance and redundancy. We will continue to invest in our infrastructure for the remainder of 2008 to ensure the highest uptime possible.
I would like to close by personally apologizing for any inconvenience this outage may have caused you. If you need any further information, please feel free to contact me directly.
First, I have to thank Autotask for being so forthright and frank about what the problems were. Second, they did an admirable job of specifying their plans for mitigating this issue in the future. Finally, Adam closed by asking anyone with further questions to contact him directly. This is far and away one of the best examples of a problem response by a hosted service I’ve seen over this past year. If you compare this response to The Planet’s, or even to Dreamhost’s handling of its monstrous billing errors, I have to say that Autotask has my complete vote of confidence in how they handled this matter. No one is perfect, and even the best-planned network systems can fail (especially if a truck drives into a transformer).
Thank you, Autotask. I look forward to your services becoming even better and more reliable as a result of this outage.