June 25, 2005
FeedBurner Outage
While we have worked hard this past year on a robust network design to prevent any single point of failure from bringing down the system, we are currently hosted in one network center at Equinix. Some time after midnight EST on Saturday, Equinix lost power due to a fire in one of the city's central power stations. Though we host in an Equinix facility that boasts redundant power supplies, those backup supplies failed, causing FeedBurner, Morningstar, and a number of other major sites hosted in Chicago to lose power and go offline (after our own backup power supplies ran out of juice). Power was restored to our systems at 4:10am EST, and our network then came back online. We then ran a series of database diagnostics and utilities to ensure data integrity.
We have been designing a geographic redundancy plan that we will now dramatically accelerate. None of our publishers care to know whose fault this is; they just want a working service. Obviously, we are extremely frustrated that paying for Tier 1 service did not result in Tier 1 service in that rare instance when it was needed.
In a cruel twist, since we are actually moving offices on Monday, we had just moved our blog from our office servers to our central hosting facility two days ago, so that the blog wouldn't be down when we moved. As such, we weren't able to post this until our central facility regained power. Again, our response to this will be to get much more aggressive about investing in geographic redundancy as quickly as possible.
We realize there are thousands of European and Asian publishers who experienced this issue mid-morning or mid-afternoon on Saturday. We will respond to all emails individually, and we appreciate your feedback.
Comments
Being European, I noticed your downtime this morning when checking my feeds.
Thanks for being so honest about it here on the blog, have a good time moving into new offices and good luck with the redundancy planning.
"those backup supplies failed"
I've never heard anybody say, "those backup supplies kicked in and saved the day." They always fail. Early on at 724, we twice lost major pieces of SourceSafe, only to find the backups were blank. We didn't even learn after the first incident. Do backups (power or tape) ever work?
How dare you! I'm going to sue you for every penny you have!
(British humour, don't worry; I won't!)
I had forgotten you were located here in Chicago. I'd also forgotten ComEd could be so problematic.
Yes, I noticed the outage in my logs. It looks like Ping-o-matic is also down for the count.
Best regards
Toby Mack Production
Home of the hit comedy series:
Jonathan Boss Secret Agent 14 and a Half
http://feeds.feedburner.com/FlashItUpBlog
The datacenter our servers are located in lost power, and of course the huge array of batteries (I think it could run something like 20 hours at current capacity, since not many people are in there) failed. Fortunately everything was fine. As for the actual backups, I take the time every two weeks to make sure the backups actually work, just to be safe.
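That kind of periodic backup check can be as simple as restoring the archive to a scratch directory and diffing it against the live data. A minimal sketch, assuming a tar-based backup; all paths here are illustrative, created on the fly for the demonstration:

```shell
#!/bin/sh
set -e
# Illustrative only: create sample data, back it up, then verify the backup
# by restoring it somewhere else and comparing against the original.
DATA=$(mktemp -d); BACKUPDIR=$(mktemp -d); SCRATCH=$(mktemp -d)
echo "feed entries" > "$DATA/feeds.txt"

# Take the backup.
tar -czf "$BACKUPDIR/backup.tar.gz" -C "$DATA" .

# Verify: restore to a scratch directory and compare file-by-file.
tar -xzf "$BACKUPDIR/backup.tar.gz" -C "$SCRATCH"
if diff -r "$DATA" "$SCRATCH" > /dev/null; then
  echo "backup verified"
else
  echo "backup MISMATCH"
fi

rm -rf "$DATA" "$BACKUPDIR" "$SCRATCH"
```

Run on a schedule (cron, every two weeks), a check like this catches blank or truncated archives before you actually need them.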
I wonder if this is the same problem that has hit Ping-o-matic... down since Sunday.
The best thing you could do is ask the network provider to pull the plug on your server once a month, and verify that way that the backup power is still working. We actually do that at the place I work. Once a month, the main power lines that feed the big data centers are cut. This way they KNOW that emergency power works (and have a good reason to keep it in perfect shape).
Thanks for being open about this though.
Greetings from Germany
Chris
