Cool new ways to store and retrieve data in DNA
Researchers are developing ways to store data in DNA without DNA synthesis.

The world today generates mind-boggling amounts of data. Think of all of us on the internet, science experiments with a gazillion digital files, and connected devices that constantly talk to each other. And as we generate more data every year than the year before, we could run out of both the space and the energy that data centres need to store it.
Storing data in DNA, in principle, is an incredible alternative. DNA lasts a long time — intact DNA has been found in fossils over a million years old — and replicates information over multiple generations. It’s also incredibly information-dense. Every gram of it can contain far more bits than conventional data storage media.
Earlier this month, I attended a conference in Prague that looked at the promises and challenges of this technology. It featured some fantastic talks covering different aspects of DNA-based data storage. Here are three of them that I'm keen to follow as they play out.
Storing data in more than the sequence
DNA has a four-letter alphabet that can be converted back and forth into binary code. Storing data in DNA typically involves figuring out which DNA sequence would encode that data and then synthesizing it. But DNA synthesis is slow and expensive. While researchers are exploring ways to speed that up, some are figuring out ways to store data in DNA without synthesis. Naturally, these methods store information in aspects of DNA other than its sequence.
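To make the conventional, sequence-based route concrete, here is a minimal sketch of converting bytes to DNA letters and back. The two-bits-per-base mapping is an assumption chosen for illustration, not a scheme presented at the conference, and the function names are hypothetical. Real encoding schemes typically also add error correction and avoid troublesome sequences such as long runs of a single base.

```python
# Hypothetical mapping, chosen for illustration: two bits per nucleotide.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Turn bytes into a DNA sequence, four bases per byte."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(sequence: str) -> bytes:
    """Turn a sequence produced by encode() back into bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in sequence)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


sequence = encode(b"Hi")
print(sequence)          # CAGACGGC
print(decode(sequence))  # b'Hi'
```

The sequence that comes out of `encode` is what would then have to be synthesized, which is the slow and expensive step.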
One approach is a throwback to punch cards that stored information in the presence or absence of holes. Likewise, information could be stored in the presence or absence of nicks made with gene editing technologies at particular sites on the DNA. In another approach, data is stored in bumps on DNA nanostructures, with their presence or absence at particular sites equating to ones and zeroes.
Although these methods are not as information-dense as storing data in a DNA sequence, they could improve the utility of DNA data storage. Nicks can be easily resealed with enzymes that stitch DNA together, which makes nicking a rewritable way to store data in DNA. Bumps, meanwhile, can be "read" without DNA sequencing, as they can be seen under a microscope.
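The punch-card idea can be sketched in a few lines. This is a toy model under stated assumptions: a strand is reduced to a fixed number of numbered sites, a nick at a site means 1 and an intact site means 0, and the function names are hypothetical. It says nothing about how the nicks are actually made or detected in the lab.

```python
def write_nicks(bits: str) -> set[int]:
    """Return the sites that get a nick: one for every '1' in the bit string."""
    return {site for site, bit in enumerate(bits) if bit == "1"}


def read_nicks(nicked_sites: set[int], n_sites: int) -> str:
    """Read every site in order: nicked means '1', intact means '0'."""
    return "".join("1" if site in nicked_sites else "0" for site in range(n_sites))


def reseal(nicked_sites: set[int]) -> set[int]:
    """Model erasing: once every nick is sealed, no site is nicked."""
    return set()


N_SITES = 8  # hypothetical number of addressable sites on the strand

strand = write_nicks("10110010")
print(sorted(strand))               # [0, 2, 3, 6]
print(read_nicks(strand, N_SITES))  # 10110010

# Rewriting: seal the old nicks, then nick a new pattern on the same strand.
strand = reseal(strand)
strand = write_nicks("00001111")
print(read_nicks(strand, N_SITES))  # 00001111
```

The same presence-or-absence logic would apply to bumps on a nanostructure; only the way the sites are read changes.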
Retrieving data from DNA pools
Along with writing data in DNA, retrieving it is a bottleneck to DNA data storage, too, even though DNA sequencing is much cheaper and faster than synthesis. Retrieving a particular file shouldn't require sequencing the entire library, but that's how it's often done. To overcome this limitation, researchers are working on random access for DNA data storage: the ability to retrieve a particular piece of information without having to go through everything.
To do this, different files are typically stored on different DNA sequences, each with a unique identifier. When a particular file needs to be read, the identifier holds the key to sequencing just that particular DNA sequence. But sequencing for retrieval introduces errors, and because it uses up the DNA that is taken out of the pool, each read also loses some of the stored data. Imagine you could copy data from a hard disk only a certain number of times before it’s lost completely. To avoid these problems, researchers are developing techniques that find particular sequences in a pool faster and access them without depleting them.
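Here is a rough sketch of identifier-based random access, assuming each strand is simply an identifier followed by a payload. The identifier length, the example sequences, and the function names are all hypothetical, and the prefix match stands in for whatever lab technique actually picks out the tagged strands. The sketch also ignores the order of strands within a file and the errors and depletion that real retrieval has to contend with.

```python
import random

ID_LENGTH = 4  # hypothetical identifier length, in bases


def build_pool(files: dict[str, list[str]]) -> list[str]:
    """Tag every payload strand with its file's identifier, then mix the pool."""
    pool = [identifier + payload
            for identifier, payloads in files.items()
            for payload in payloads]
    random.shuffle(pool)  # strands in a pool have no physical order
    return pool


def retrieve(pool: list[str], identifier: str) -> list[str]:
    """Return the payloads of strands tagged with the identifier, leaving the rest unread."""
    return [strand[ID_LENGTH:] for strand in pool if strand.startswith(identifier)]


pool = build_pool({
    "ACGT": ["TTTT", "GGGG"],  # file 1, split over two strands
    "TGCA": ["CCCC"],          # file 2
})
print(sorted(retrieve(pool, "ACGT")))  # ['GGGG', 'TTTT']
print(retrieve(pool, "TGCA"))          # ['CCCC']
```

In this toy version the pool is untouched after a read. Getting the physical equivalent of that, where reading a file does not use up the DNA, is exactly what the non-depleting access techniques are after.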
Others are working on creative ways to store DNA that make information retrieval easier and, simultaneously, improve its preservation. For example, researchers from CIC nanoGUNE are encapsulating DNA in two crosslinked polymer nanofibers. One polymer holds onto the DNA incredibly tightly, and the other lets it go when information needs to be read.
Doing more with little DNA
Researchers stitching DNA into polymers told me that, in the future, they could store DNA in textiles that could be turned into clothes. This is the premise behind the DNA of things. While DNA data storage is often thought of as a sustainable way to store copious amounts of data, interesting things can also be done by storing small amounts of data in DNA.
An important area of research focus is traceability. Tamper-proof packaging tackles the problem of counterfeit drugs, but only if the counterfeiters cannot hack the packaging. Tagging the product itself with an unclonable signature is more secure. To this end, researchers are developing DNA-based authentication strategies that work by encrypting information in DNA.
In another example of storing information beyond the sequence, instructions could be coded in the shape of a DNA origami molecule. This could pave the way for DNA-based responsive materials that interact with their environment.
Although DNA data storage is in its infancy, the technologies enabling it are witnessing massive improvements. DNA synthesis, DNA nanotechnology, and gene editing are advancing dozens of other applications across biotech and medicine. It'll be interesting to see how those insights are brought back to the problem of sustainable, long-term data storage.