Thoughts on the Amazon outage

Disaster Recovery needs to be a primary objective when planning and implementing any IT project, outsourced or not. The ‘Cloud’ isn’t magic, the ‘Cloud’ isn’t fail-proof, the ‘Cloud’ requires hardware, software, networking, security, support and execution – just like anything else.

All the fancy marketing speak, recommendations and free trials can’t replace the need to do obsessive due diligence before trusting any provider, no matter how big and awesome they may seem or what their marketing department promises.

Why do Data Centers have UPS and Diesel Generators on-site? They know electricity can and does fail.

Why do we buy servers with dual power supplies? We know they can and do fail.

Why do we implement RAID? We know hard drives can and do fail.

Prepare for the worst, period.

Putting all of your eggs in one cloud, so to speak, no matter how much redundancy the provider says it has, seems short-sighted in my opinion. If you rely on an MSP, HSP, CSP, IaaS, SaaS, PaaS, et al. to attract, increase or fulfill a large percentage of your revenue, or all of it, as many companies do nowadays, then you need to assume that every vendor will eventually have an issue like this, one that affects your overall uptime, brand and churn rate. A blip here and there is tolerable.

Amazon’s downtime is stratospherically high, and their prices are spectacularly inflated. Their ping times are terrible and they offer little that anyone else doesn’t offer. Anyone holding them up as a good solution without an explanation has no idea what they’re talking about.

The same hosting platform, as always, is preferred: dedicated boxes at geographically disparate and redundant locations, managed by different companies. That way when host 1 shits the bed, hosts 2 and 3 keep churning.
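As a rough illustration of the "hosts 2 and 3 keep churning" idea, here is a minimal Python sketch that walks an ordered list of hosts and uses the first one that passes a health check. The host URLs, the `/health` path and the function names are all made up for the example; a real setup would more likely do this at the DNS or load-balancer level, so treat it as a sketch of the principle rather than a recipe.

```python
import urllib.error
import urllib.request
from typing import Callable, Iterable, Optional

# Hypothetical health-check URLs, one per independent hosting company.
HOSTS = [
    "https://host1.example.com/health",
    "https://host2.example.net/health",
    "https://host3.example.org/health",
]


def http_ok(url: str, timeout: float = 2.0) -> bool:
    """Return True if the URL answers with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, DNS failure, timeout, bad URL, non-2xx, etc.
        return False


def first_healthy(
    hosts: Iterable[str],
    check: Callable[[str], bool] = http_ok,
) -> Optional[str]:
    """Return the first host that passes the check, or None if all fail."""
    for host in hosts:
        if check(host):
            return host
    return None
```

Calling `first_healthy(HOSTS)` returns the primary while it is up and falls through to the next provider when it is not. The `check` parameter is there so the failover logic can be exercised without touching the network.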

Nobody who has even a rudimentary best-practice hosting setup has been affected by the Amazon outage in any way other than a speed hit as their resources shift to a secondary center.

Stop following the new-media goons around. They don’t know what they’re doing. There’s a reason they’re down twice a month and making excuses.

Personally, I do not use a server for “mission critical” applications that I cannot physically kick. Failing that, a knowledgeable SysAdmin that I can kick.

  • You know of a knowledgeable SysAdmin?!?

  • I know of one. Never met him though!

  • I’m surprised the word “fluff” didn’t appear in there at least once :)

  • Paul

    I agree. I also noticed that two of our users (located in China and Greece) couldn’t access our AWS servers a few weeks back. It wasn’t a DNS issue and the servers were up. It appeared to be a network routing issue to its IP. They had no problem routing to another server we have in Germany.

  • Rob Hutten

     I’m blushing, Nick.

  • You have made some decent points there. I looked on the internet to find out more about the issue and found most individuals will go along with your views on this web site.