Data storage technologies are struggling to keep up: the amount of data in the world is doubling every two years, according to a 2014 estimate by EMC. As a result, researchers are looking at various alternatives as possible storage media.
Recently, researchers Yaniv Erlich and Dina Zielinski of the Data Science Institute at Columbia University and the New York Genome Center (NYGC) unveiled a new technique that allows DNA to store more data than ever before. In nature, DNA stores information about different forms of life and their characteristics using four base nucleotides: A, G, C and T. DNA has been studied for a while as a possible solution for storing human-generated data.
In essence, DNA works just like your hard drive, but instead of binary ones and zeros to store digital data, it uses a quaternary base to store information about a living organism’s genes. DNA is an ideal storage medium because it is ultra-compact and can last hundreds of thousands of years if kept in a cool, dry place, as demonstrated by the recent recovery of DNA from the bones of a 430,000-year-old human ancestor found in a cave in Spain.
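To make the binary-versus-quaternary comparison concrete, here is a minimal sketch of how bytes could be written with four letters, two bits per base. The mapping table and the function names `encode` and `decode` are assumptions chosen for illustration; this naive scheme is not the technique the researchers used, and it ignores the error-correction and biochemical constraints a real system would need.

```python
# Hypothetical mapping: each pair of bits becomes one of the four bases.
# The assignment (00 -> A, 01 -> C, ...) is arbitrary and for illustration only.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Turn bytes into a string of bases, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Turn a string of bases back into the original bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


if __name__ == "__main__":
    strand = encode(b"Hi")
    print(strand)          # four bases per byte
    print(decode(strand))  # round-trips to the original bytes
```

Under this assumed mapping, every byte takes exactly four bases, which is the theoretical ceiling of two bits per nucleotide.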
“DNA won’t degrade over time like cassette tapes and CDs, and it won’t become obsolete – if it does, we have bigger problems,” said Yaniv Erlich from Columbia University.
The researchers showed how an algorithm designed for streaming video on a cellphone can unlock DNA’s nearly full storage potential by squeezing more information into its four base nucleotides. In their experiment, the researchers said, they successfully stored six files in 72,000 DNA strands, each 200 bases long: a full computer operating system (KolibriOS), an 1895 French film (“Arrival of a train at La Ciotat”), a $50 Amazon gift card, a computer virus, a Pioneer plaque, and a 1948 study by information theorist Claude Shannon.
After this, they retrieved the data using DNA sequencing technology and then used software to translate the genetic code back into binary form so that it became readable again. The files were recovered with no errors.
“To retrieve the information, we sequenced the molecules. This is the basic process,” Erlich said.
Erlich explained why DNA is a better option than existing storage media. “DNA has several advantages to store information,” he said. “The first thing is that it’s very compact. In effect, it’s about one million times more compact than what you can get when you use a regular digital media.”
The storage capacity is massive: it can reach a density of 215 petabytes per gram of DNA, and the data can last a very long time, potentially more than 100 years.
“We believe this is the highest-density data storage device ever created,” said Erlich.
The main barriers to commercialisation at the moment are time and money: it takes about two weeks to synthesize the DNA, and it costs $7,000 to synthesize 2MB of data into DNA and another $2,000 to read it.
Despite this, the research team is very optimistic. When asked how long it would take for this technology to become widely available, Erlich replied, “I would guess more than a decade. We are still in early days, but it also took magnetic media years of research and development before it became useful.”
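For a rough sense of scale, the figures quoted above can be turned into a per-megabyte cost. This is a back-of-the-envelope sketch that uses only the numbers in the article; the constant and function names are illustrative, and it assumes the $7,000 and $2,000 figures both apply to the same 2MB of data.

```python
# Figures quoted in the article; the names are illustrative.
WRITE_COST_USD = 7_000   # cost to write 2MB of data into DNA
READ_COST_USD = 2_000    # cost to read it back
DATA_MB = 2


def cost_per_mb(total_cost_usd: float, data_mb: float) -> float:
    """Average cost in US dollars per megabyte."""
    return total_cost_usd / data_mb


if __name__ == "__main__":
    write = cost_per_mb(WRITE_COST_USD, DATA_MB)
    read = cost_per_mb(READ_COST_USD, DATA_MB)
    print(f"write: ${write:,.0f}/MB, read: ${read:,.0f}/MB, "
          f"round trip: ${write + read:,.0f}/MB")
```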
The research has been published in the