Microsoft turns to manufactured DNA for data storage


Existing data storage methods can’t keep up with the amount of data we need to store, so Microsoft is creating a fully automated data-to-DNA storage system as a solution.

Microsoft says we’re soon going to be faced with a bit of a data storage crisis. We are producing so much data that we’re going to reach a point where there’s more data than storage available. To solve this impending problem, Microsoft is turning to DNA. 

According to Seagate, we created 33 zettabytes of data in 2018, and that figure will grow to 175 zettabytes by 2025. Hard drives continue to grow in capacity, but even if they manage to keep up with demand, they require a lot of physical space and cooling, meaning datacenters will need to expand. DNA, on the other hand, can store data “in a space that’s orders of magnitude smaller than datacenters,” but we need to figure out how to automate the data-to-DNA process and to do so cheaply.

A team of researchers at Microsoft, working with the University of Washington, believes it has taken the first step towards doing just that. A proof-of-concept test successfully demonstrated “the first fully automated system to store and retrieve data in manufactured DNA.”

Karin Strauss, principal researcher at Microsoft Research, explained, “Our ultimate goal is to put a system into production that, to the end user, looks very much like any other cloud storage service – bits are sent to a datacenter and stored there and then they just appear when the customer wants them. To do that, we needed to prove that this is practical from an automation perspective.”

The system works by taking the ones and zeros used to store digital data and converting them into As, Ts, Cs, and Gs, which are the building blocks of DNA. A synthesizer is then used to turn the converted data into DNA strands for storage. The process requires off-the-shelf lab equipment, liquids, and chemicals. To retrieve the data, the DNA liquid is moved using a microfluidic pump so that it can be “read” and converted back to ones and zeros.
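To make the conversion step concrete, here is a minimal Python sketch of how bits could be mapped to bases and back. The two-bits-per-base table and the function names `encode` and `decode` are assumptions chosen for illustration; the article does not describe the encoding Microsoft actually uses, and a practical scheme would also need error correction and would typically avoid long runs of the same base.

```python
# Illustrative only: a naive two-bits-per-base mapping.
# This table is an assumption, not the scheme used by Microsoft's system.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Turn bytes into a string of A/C/G/T, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Turn a string of A/C/G/T back into the original bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


strand = encode(b"hello")
print(strand)          # CGGACGCCCGTACGTACGTT
print(decode(strand))  # b'hello'
```

Under this mapping each byte becomes four bases, so the five-byte message “hello” becomes a 20-base sequence.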

The proof-of-concept test encoded “hello” as DNA and then converted it back, all using a fully automated setup. What does that mean for data storage when it’s scaled up? The data stored in a warehouse-sized datacenter today would fit into “a space roughly the size of a few board game dice.”

With a successful test completed, one of the next challenges is developing a way to search using DNA molecules.