A film, a computer operating system, a virus AND an Amazon gift card are all stored on a single piece of DNA

  • Scientists have used DNA to create the most effective storage space ever 
  • The team believe they could compress all the world's data into a single room
  • This age-old storage solution could help us preserve knowledge forever
  • Researchers have been able to pack 215 petabytes of data on a single gram of DNA - 100 times more than previous efforts

Humanity is currently producing data at exponential rates and we need devices to store it all. 

Scientists are solving this modern problem with nature’s age-old solution for information-storage - DNA.  

Using this amazing technique, the team believe we could compress all the world's data into a single room. 

Now researchers have been able to pack 215 petabytes of data on a single gram of DNA - 100 times more than previous efforts. 


Researchers from New York Genome Centre and Columbia University have used DNA to create the highest-density data-storage device ever. DNA is an ideal storage medium because it's ultra-compact and can last hundreds of thousands of years if kept in a cool, dry place


HOW DOES IT WORK?

The researchers began with a method that converts the long strings of ones and zeros in digital data into sequences of the four basic building blocks of DNA: adenine, guanine, cytosine and thymine. 

The digital data was then chopped into pieces and stored by synthesizing a massive number of tiny DNA molecules, which can be dehydrated and preserved for a long time.  

In all they generated a digital list of 72,000 DNA strands, each 200 bases long.

To retrieve their files, they used modern sequencing technology to read the DNA strands, followed by software to translate the genetic code back into binary.

They recovered their files with zero errors. 
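As a rough illustration of the conversion step described above, the sketch below turns bytes into DNA letters and back again. The article does not say which bit pair goes with which base, so the two-bits-per-base table here is an assumption made for the example, and the function names are hypothetical. It is not the researchers' own code.

```python
# Assumed mapping of bit pairs to bases (illustrative only; the article
# does not specify the table the researchers used).
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def bytes_to_dna(data: bytes) -> str:
    """Write each byte as 8 bits, then map every 2 bits to one base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def dna_to_bytes(strand: str) -> bytes:
    """Reverse the mapping: bases back to bits, bits back to bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


message = b"DNA"
strand = bytes_to_dna(message)
print(strand)                 # CACACATGCAAC
print(dna_to_bytes(strand))   # b'DNA'
```

Each byte becomes four bases under this scheme, so a three-letter message needs a 12-base strand.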

Researchers Yaniv Erlich and Dina Zielinski from New York Genome Centre and Columbia University created an algorithm, called DNA Fountain, which could provide large-capacity information storage.

Using the algorithm, the researchers squeezed six files into a single speck of DNA: a full computer operating system, an 1895 French film, 'Arrival of a train at La Ciotat', a $50 Amazon gift card, a computer virus, a Pioneer plaque and a 1948 study by information theorist Claude Shannon.

The algorithm can unlock DNA’s nearly full storage potential by squeezing more information into its four base nucleotides.

The team believe they are approaching the theoretical maximum for data storage.   

'DNA won't degrade over time like cassette tapes and CDs, and it won't become obsolete – if it does, we have bigger problems,' said Yaniv Erlich from Columbia University. 

The result, according to Dr Erlich, is the 'highest-density data-storage device ever created.'

DNA is an ideal storage medium because it's ultra-compact and can last hundreds of thousands of years if kept in a cool, dry place. 

The technique could help us preserve knowledge forever, unlike current storage media, which degrade over time, according to the paper published in Science, the journal of the American Association for the Advancement of Science.  

Dr Erlich and Zielinski compressed the files into a master file, and then split the data into short strings of binary code made up of ones and zeros.

Using an erasure-correcting algorithm called DNA Fountain, they randomly packaged the strings into so-called droplets, and mapped the ones and zeros in each droplet to the four nucleotide bases in DNA: A, G, C and T.

The algorithm deleted letter combinations known to create errors, and added a barcode to each droplet to help reassemble the files later.
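The droplet idea can be sketched in a few lines of code. This is a simplified illustration, not the DNA Fountain implementation: the way segments are chosen, the two-byte seed used as a barcode, the bit-to-base table and the screening thresholds (no more than three identical letters in a row, and between 40 and 60 per cent G and C) are all assumptions made for the example. Decoding is left out.

```python
import random

# Assumed bit-pair-to-base table (illustrative only).
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}


def to_dna(data: bytes) -> str:
    bits = "".join(f"{b:08b}" for b in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def make_droplet(segments, seed):
    """XOR a seed-selected random subset of segments into one payload.

    The seed is prepended as a 2-byte barcode so that a decoder could
    regenerate the same subset later (hypothetical layout).
    """
    rng = random.Random(seed)
    count = rng.randint(1, len(segments))
    chosen = rng.sample(range(len(segments)), count)
    payload = bytes(len(segments[0]))  # starts as all zero bytes
    for idx in chosen:
        payload = bytes(a ^ b for a, b in zip(payload, segments[idx]))
    return seed.to_bytes(2, "big") + payload


def passes_screen(strand, max_run=3, gc_range=(0.4, 0.6)):
    """Reject strands with long single-letter runs or skewed G/C content."""
    run = 1
    for prev, cur in zip(strand, strand[1:]):
        run = run + 1 if cur == prev else 1
        if run > max_run:
            return False
    gc = sum(base in "GC" for base in strand) / len(strand)
    return gc_range[0] <= gc <= gc_range[1]


def encode(segments, wanted):
    """Keep generating droplets, discarding any that fail the screen."""
    strands = []
    seed = 0
    while len(strands) < wanted and seed < 2**16 - 1:
        seed += 1
        strand = to_dna(make_droplet(segments, seed))
        if passes_screen(strand):
            strands.append(strand)
    return strands


# Stand-in data: 16 segments of 8 pseudo-random bytes each.
data_rng = random.Random(0)
segments = [bytes(data_rng.randrange(256) for _ in range(8)) for _ in range(16)]
strands = encode(segments, wanted=5)
for s in strands:
    print(s)
```

Because droplets that fail the screen are simply thrown away and new ones generated, the encoder never has to repair a bad sequence; it only needs enough good droplets to rebuild the file.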

Scientists chose six files to encode into DNA: a full computer operating system, an 1895 French film, 'Arrival of a train at La Ciotat', a $50 Amazon gift card, a computer virus, a Pioneer plaque and a 1948 study by information theorist Claude Shannon.



The researchers show that their coding strategy packs 215 petabytes of data on a single gram of DNA - 100 times more than previous efforts. 
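To put the density figure in perspective, the short calculation below works out how much DNA a large archive would need at 215 petabytes per gram. The 1,000-petabyte archive is a hypothetical example, not a figure from the study.

```python
DENSITY_PB_PER_GRAM = 215   # figure reported in the article
IMPROVEMENT_FACTOR = 100    # "100 times more than previous efforts"


def grams_needed(petabytes: float) -> float:
    """Mass of DNA required to hold the given amount of data."""
    return petabytes / DENSITY_PB_PER_GRAM


# Density implied for the earlier efforts the article compares against.
previous_density = DENSITY_PB_PER_GRAM / IMPROVEMENT_FACTOR

archive_pb = 1_000  # hypothetical archive size, in petabytes
print(f"{grams_needed(archive_pb):.2f} g")                    # 4.65 g
print(f"{archive_pb / previous_density:.0f} g at the old density")
```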

Using an erasure-correcting algorithm called DNA Fountain, the researchers randomly packaged the strings into so-called droplets, and mapped the ones and zeros in each droplet to the four nucleotide bases in DNA: A, G, C and T. In all they generated a digital list of 72,000 DNA strands, each 200 bases long.


Cost remains a barrier. The researchers spent $7,000 (£5,700) to synthesize the DNA they used to archive their 2 megabytes of data, and another $2,000 (£1,600) to read it. 
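Those figures translate into a steep price per megabyte, as the quick calculation below shows. It uses only the dollar amounts and the 2-megabyte size quoted above; the variable names are made up for the example.

```python
synthesis_cost_usd = 7_000    # writing the DNA
sequencing_cost_usd = 2_000   # reading it back
data_megabytes = 2

write_per_mb = synthesis_cost_usd / data_megabytes
read_per_mb = sequencing_cost_usd / data_megabytes
total_per_mb = write_per_mb + read_per_mb

print(f"write: ${write_per_mb:,.0f} per MB")   # write: $3,500 per MB
print(f"read:  ${read_per_mb:,.0f} per MB")    # read:  $1,000 per MB
print(f"total: ${total_per_mb:,.0f} per MB")   # total: $4,500 per MB
```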

Though the price of DNA sequencing has fallen exponentially, there may not be the same demand for DNA synthesis, says Sri Kosuri, a biochemistry professor at UCLA who was not involved in the study. 

'Investors may not be willing to risk tons of money to bring costs down,' he said.

But the price of DNA synthesis can be vastly reduced if lower-quality molecules are produced, and coding strategies like DNA Fountain are used to fix molecular errors, says Dr Erlich. 

'We can do more of the heavy lifting on the computer to take the burden off time-intensive molecular coding,' he said.

 

 
