Amazon Web Services is still working to resolve problems that hit its Elastic Compute Cloud (EC2) and Elastic Block Store (EBS) services more than two days ago, causing increased error rates and connectivity problems that took down major Web 2.0 sites reliant on the infrastructure, such as Quora and Reddit.
The connectivity errors affecting EC2 instances and the latencies affecting EBS volumes were first noted on Thursday morning, hitting multiple areas of Amazon's US-EAST-1 region.
"A networking event early this morning triggered a large amount of re-mirroring of EBS volumes in US-EAST-1. This re-mirroring created a shortage of capacity in one of the US-EAST-1 Availability Zones, which impacted new EBS volume creation as well as the pace with which we could re-mirror and recover affected EBS volumes," noted a posting on Amazon's Service Health Dashboard site.
"Additionally, one of our internal control planes for EBS has become inundated such that it's difficult to create new EBS volumes and EBS backed instances. We are working as quickly as possible to add capacity to that one Availability Zone to speed up the re-mirroring, and working to restore the control plane issue."
However, as of 10am BST, the firm said it was still working to clear the bottleneck in the system that is slowing efforts to re-establish connections between volumes and their instances.
AWS also said on Saturday morning that it was still working to restore access to the remaining Relational Database Service instances that have been affected.
Sites such as Quora, HootSuite, Foursquare and Reddit now appear to be working as usual after earlier outages, but the incident will be a timely reminder of the risks involved in outsourcing any computing infrastructure to the cloud.