The unreliability of Amazon Web Services is causing some cloud-dependent firms to reconsider their options and move their infrastructure onto other platforms.
Problems with the firm's Elastic Block Store (EBS) service occurred at an Amazon datacentre in Virginia on 17 and 18 March. The issues were fixed relatively quickly, but still had a significant impact.
"From 7:28pm PDT to 9:56pm PDT a networking issue affected connectivity to a significant number of instances in the US-EAST-1 region. Affected instances experienced degraded network connectivity to the internet and to instances in other availability zones," said the firm on its Service Health Dashboard.
"The root cause of last night's issue was when a core network routing device experienced a partial failure. While the router was causing packet loss, the failure was not detected by surrounding network devices and therefore they did not automatically fail traffic over to redundant network paths as intended."
However, while Amazon apparently resolved the problems relatively easily in-house, the same cannot be said for the companies relying on its services.
Heroku, a Ruby-based cloud platform-as-a-service provider, said that it was still experiencing problems on Friday.
"Network connectivity has improved substantially, but we are still seeing brief periods of instability as additional networking changes are applied to mitigate the problems. We will continue to provide updates as soon as we have anything to share," the company wrote on 18 March.
Reddit, the news and link-sharing site, also suffered from the outage and was, like Heroku, forced to use its blog to explain the problem to customers.
"As most of you are probably aware, we had some serious downtime with the site today," wrote the firm.
"As you will see, the blame was partly ours and partly Amazon's (our hosting provider). But you probably don't care who is to blame, and we aren't here to assign blame. We just want to tell you what happened."
Reddit said that, despite Amazon's relatively swift reaction to the problems, and the usefulness of EBS, the company had decided to move some of its systems away from Amazon's cloud storage offering.
"Even before the serious outage last night, we suffered random disks degrading multiple times a week. While we do have protections in place to mitigate latency on a small set of disks by using raid-0 stripes, the frequency of degradation has become highly unpalatable," Reddit said.
"Over the course of the past few weeks, we have been working to completely move Cassandra [servers] off EBS and onto the local storage which is directly attached to the EC2 instances."
In a move that could be mirrored by other companies affected by the outage, Reddit argued that, although local storage offers far less functionality than EBS, its greater reliability outweighs those benefits.
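For readers unfamiliar with the trade-off Reddit describes, the choice between EBS and local storage is made when an EC2 instance is launched: EBS volumes are network-attached and persist independently of the instance, while instance-store disks sit on the physical host, offering more predictable latency but losing their contents when the instance terminates. The sketch below, using the boto Python library, shows roughly how an instance-store disk is mapped in at launch; it is an illustration only, not Reddit's actual tooling, and the AMI ID and instance type are placeholders.

```python
# Sketch: launching an EC2 instance with a local (instance-store)
# disk rather than relying on EBS, using the boto library. The AMI
# ID and instance type are placeholders, not values from the article.
import boto.ec2
from boto.ec2.blockdevicemapping import BlockDeviceMapping, BlockDeviceType

conn = boto.ec2.connect_to_region('us-east-1')

# Map an ephemeral (instance-store) disk to a device name. Unlike an
# EBS volume, this disk lives on the physical host: no network hop,
# but its contents vanish if the instance is stopped or terminated.
mapping = BlockDeviceMapping()
mapping['/dev/sdb'] = BlockDeviceType(ephemeral_name='ephemeral0')

reservation = conn.run_instances(
    'ami-00000000',             # placeholder AMI
    instance_type='m1.xlarge',  # larger instance types included local disks
    block_device_map=mapping,
)
print(reservation.instances[0].id)
```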
An ex-Reddit employee posting on the firm's discussion board on Friday was more outspoken, claiming that EBS alone accounts for more than 80 per cent of Reddit's downtime.
"Amazon EBS is a barrel of laughs in terms of performance and reliability and a constant (and the single largest) source of failure across Reddit," he wrote.
"Reddit's been in talks with Amazon all the way up to CIOs about ways to fix them for nearly a year and they've constantly been making promises that they haven't been keeping, passing us to new people that 'will finally be able to fix it', and variously otherwise desperately trying to keep Reddit while not actually earning it."