Dropbox has revealed that it has migrated the majority of its storage from Amazon Web Services (AWS) to its own infrastructure, bucking the industry trend in order to gain tighter control over the performance of its service.
Dropbox officially launched in 2008 as a cloud-based service that lets users store files online and access them across multiple devices and operating systems.
The service made use of Amazon's cloud-based Simple Storage Service (Amazon S3) from the outset for storing the actual files, with user metadata and the Dropbox website hosted on the firm's own infrastructure.
However, Dropbox has spent the past few years building up its own storage infrastructure and migrating the enormous volume of user files onto it from AWS. As a result, it now stores and serves over 90 percent of user data on its own custom-built infrastructure.
At first glance, Dropbox appears to be going against the industry trend of moving services to the cloud wherever possible, which spares firms from having to procure and manage infrastructure to deliver applications and services. But Dropbox is not a typical user, as the firm's vice president of infrastructure, Akhil Gupta, explained in a blog posting.
"As the needs of our users and customers kept growing, we decided to invest seriously in building our own in-house storage system. There were a couple reasons behind this decision. First, one of our key product differentiators is performance. Bringing storage in-house allows us to customise the entire stack end-to-end and improve performance for our particular use case," Gupta said.
"Second, as one of the world's leading providers of cloud services, our use case for block storage is unique. We can leverage our scale and particular use case to customise both the hardware and software, resulting in better unit economics," he added.
In other words, because file storage is the firm's line of business, it makes sense for it to sink large sums of money into the infrastructure and software necessary to deliver the optimum experience for its customers.
However, Amazon Web Services "continues to be an invaluable partner - we couldn't have grown as fast as we did without a service like AWS," Gupta said.
Dropbox is keeping its cards close to its chest on many of the details of its new custom platform, but disclosed that it is using a technique similar to erasure coding in order to store files as redundant pieces of data distributed across different drives and even data centres.
"We encode aggregated extents of data in 'volumes' that are placed on a random set of storage nodes (with sufficient physical diversity and various other constraints). Each storage node might hold a few thousand volumes, but the placement for each volume is independent of the others on the disk," wrote Dropbox Storage Team lead James Cowling, in response to a question on the Y Combinator site.
"If one disk fails we can thus reconstruct those volumes from hundreds of other disks simultaneously, unlike in RAID where you'd be limited in IOPS and network bandwidth to a fixed set of disks in the RAID array," he added.
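Cowling's point about independent placement can be illustrated with a small simulation. The Python sketch below is not Dropbox's code, and every name and number in it (1,000 disks, 200,000 volumes, nine fragments per volume) is an assumption chosen only so that each disk ends up holding roughly "a few thousand" volumes. It ignores the physical diversity constraints Cowling mentions and simply places each volume on a random set of disks, then counts how many other disks could take part in rebuilding one failed disk.

```python
import random


def place_volumes(num_disks, num_volumes, fragments_per_volume, seed=0):
    """Place each volume's fragments on an independently chosen random set of disks.

    All parameters are hypothetical; real placement would also honour
    physical-diversity constraints, which this sketch ignores.
    """
    rng = random.Random(seed)
    return [
        rng.sample(range(num_disks), fragments_per_volume)
        for _ in range(num_volumes)
    ]


def recovery_peers(placement, failed_disk):
    """Return the set of disks that share at least one volume with the failed disk.

    These are the disks that can supply data when the failed disk's
    volumes are reconstructed.
    """
    peers = set()
    for disks in placement:
        if failed_disk in disks:
            peers.update(d for d in disks if d != failed_disk)
    return peers


# Assumed figures: 200,000 volumes x 9 fragments over 1,000 disks
# gives about 1,800 volumes per disk.
placement = place_volumes(num_disks=1000, num_volumes=200_000, fragments_per_volume=9)
peers = recovery_peers(placement, failed_disk=0)

print(f"Disks able to help rebuild disk 0: {len(peers)}")
# A fixed nine-disk array, by contrast, would offer only 8 peers.
```

Under these assumptions the rebuild work is spread over hundreds of disks, whereas a fixed array of the same width confines it to the handful of disks in that array, which is the contrast with RAID that Cowling draws.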
Gupta said that Dropbox will continue to invest in its own infrastructure and to partner with Amazon wherever that makes the most sense for users.
"Later this year, we'll expand our relationship with AWS to store data in Germany for European business customers that request it. Protecting and preserving the data our users entrust us with is our top priority at all times," he said.