Should the Amazon Web Services EC2 outage impact Cloud adoption?

By SecurityAsia Editors | May 4, 2011

On 21 April 2011, Amazon Web Services Elastic Compute Cloud (EC2) had an outage that impacted multiple Availability Zones. In a recent blog post, Trend Micro highlighted key learnings and what companies need to know while adopting the Cloud.

Amazon issued a status update indicating that the outage stemmed from problems with the re-mirroring of Elastic Block Store (EBS) volumes:

"This re-mirroring created a shortage of capacity in one of the US-EAST-1 Availability Zones, which impacted new EBS volume creation as well as the pace with which we could re-mirror and recover affected EBS volumes. Additionally, one of our internal control planes for EBS has become inundated such that it’s difficult to create new EBS volumes and EBS backed instances."

A certain amount of service outage is to be expected. However, this incident raises two distinct concerns. The first is that the Amazon Availability Zones did not work as represented. Amazon provides computing resources from different geographic regions. In addition, each region offers several Availability Zones, which are supposed to be engineered to be insulated from failures in other Availability Zones. However, in the recent outage, multiple Availability Zones were impacted, showing that they did not behave as advertised.

The second concern is that this incident went beyond an availability issue. Amazon was not able to recover all of the affected volumes. On April 25, Amazon stated in an update: “We’ve determined that a small number of volumes (0.07% of the volumes in our US-East Region) will not be fully recoverable.” This seems like a small amount, but 0.07% of what? Depending on how many volumes Amazon hosts in that region, the absolute number could be considerable. And if you’re one of the customers impacted, you don’t care how small the overall percentage is.
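To make the "0.07% of what?" question concrete, here is a rough sketch of the arithmetic. Amazon did not publish the total number of EBS volumes in US-East, so the totals below are purely hypothetical assumptions chosen for illustration, as is the helper name `unrecoverable_volumes`.

```python
# Fraction Amazon reported as not fully recoverable: 0.07%.
LOSS_RATE = 0.0007


def unrecoverable_volumes(total_volumes: int, loss_rate: float = LOSS_RATE) -> int:
    """Return the approximate number of lost volumes for a given total."""
    return round(total_volumes * loss_rate)


# Hypothetical region sizes; the real total was not disclosed.
for total in (100_000, 1_000_000, 10_000_000):
    lost = unrecoverable_volumes(total)
    print(f"{total:>10,} volumes -> about {lost:,} not fully recoverable")
```

Under these assumed totals, the same 0.07% ranges from about 70 to about 7,000 volumes, which is why the percentage alone says little about how many customers were affected.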

Should the Amazon outage give cause for concern? Should it make businesses limit their cloud adoption? These are two separate questions. Yes, it should give cause for concern, but it should serve as a cautionary tale that influences how companies approach the cloud, not whether they adopt it.

This was certainly a significant outage. However, Amazon has generally done a good job of maintaining service availability, and it has built out its infrastructure with failover and load balancing beyond what most businesses are able to deploy in an on-premises data center. Although an on-premises data center may give companies a feeling of more control, especially when an incident like this occurs, in truth they will most likely get better availability through a provider that is dedicated to offering on-demand computing services.

 
 
