In a project at Harvard, published in the journal Science (and also reported on by the Wall Street Journal), scientists took a book and converted it into a digitized pool of DNA. The manuscript (in press) is on genetic engineering. It’s called “Regenesis” and was written by the project leader, Dr George Church. The book contains about 53k words and 11 pictures, and totals about 5.3 megabits of data.
DNA is obviously an impressive data storage unit. It encodes all the information needed for a cell to grow into a living organism. That in itself is amazing. The structure of DNA is a long strand consisting of four unique building blocks. To make this more of a “binary code” (as is used in digital data storage), scientists simply treated two of the building blocks as “1” and the other two as “0.”
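To make the two-blocks-per-bit idea concrete, here is a minimal Python sketch. It is only an illustration: the particular assignment (A or C for “0”, G or T for “1”), the random choice between the two candidate bases, and all function names are assumptions made for this example, not details taken from the write-up above.

```python
import random

# Assumed mapping for illustration: two bases stand for "0", two for "1".
BASES_FOR_BIT = {"0": "AC", "1": "GT"}
BIT_FOR_BASE = {"A": "0", "C": "0", "G": "1", "T": "1"}


def text_to_bits(text):
    """Turn text into a string of '0'/'1' characters, 8 bits per UTF-8 byte."""
    return "".join(f"{byte:08b}" for byte in text.encode("utf-8"))


def bits_to_text(bits):
    """Inverse of text_to_bits."""
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("utf-8")


def bits_to_dna(bits, rng=None):
    """Write one base per bit, picking either of the two bases allowed for that bit."""
    rng = rng or random.Random(0)  # fixed seed so the sketch is repeatable
    return "".join(rng.choice(BASES_FOR_BIT[bit]) for bit in bits)


def dna_to_bits(dna):
    """Read the bases back as bits; both bases of a pair decode to the same bit."""
    return "".join(BIT_FOR_BASE[base] for base in dna)


message = "Regenesis"
dna = bits_to_dna(text_to_bits(message))
decoded = bits_to_text(dna_to_bits(dna))
print(dna)
print(decoded)
```

Because each bit can be written with either of two bases, the same message can come out as many different DNA sequences, yet all of them read back as the same bits.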
Also fascinating: the DNA is not one single strand encoding the entire book in a linear fashion. The data are spread across thousands of strands, each carrying both some data (i.e. letters or words) and an ‘address’ indicating where that data fits in the complete puzzle. The entire collection of strands needs to be read and compiled, after which the data can be converted back into the book that had been encoded.
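The addressing scheme can be sketched in a few lines of Python. This is a toy model under stated assumptions: the address width (8 bits), the payload size (16 bits per fragment) and the function names are invented for the example, and it works on strings of bits rather than on actual DNA.

```python
import random

ADDRESS_BITS = 8    # assumed address width, for illustration only
PAYLOAD_BITS = 16   # assumed amount of data carried per fragment


def split_into_fragments(bits):
    """Cut a bit string into pieces, each prefixed with its position as an address."""
    starts = range(0, len(bits), PAYLOAD_BITS)
    if len(starts) > 2 ** ADDRESS_BITS:
        raise ValueError("message too long for the chosen address width")
    fragments = []
    for index, start in enumerate(starts):
        address = format(index, f"0{ADDRESS_BITS}b")
        fragments.append(address + bits[start:start + PAYLOAD_BITS])
    return fragments


def reassemble(fragments):
    """Sort fragments by address, then strip the addresses and join the payloads."""
    ordered = sorted(fragments, key=lambda frag: int(frag[:ADDRESS_BITS], 2))
    return "".join(frag[ADDRESS_BITS:] for frag in ordered)


bits = "".join(f"{byte:08b}" for byte in "DNA stores data".encode("utf-8"))
pool = split_into_fragments(bits)
random.Random(1).shuffle(pool)   # the pool has no inherent order
restored = reassemble(pool)
print(restored == bits)
```

The shuffle stands in for the fact that strands in a pool are read in no particular order; the addresses alone are enough to put the pieces back in sequence.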
At present, this technology is too immature and cumbersome to be of commercial value. But we innovate quickly. Around the turn of the century, it took years to decode the first full human genome. Now the technology exists to decode an entire human genome in about a day.
The WSJ reports Dr Church as forecasting “A device the size of your thumb could store as much information as the whole internet.”
source: Written in (DNA) Code: Science 17 August 2012: 784. [DOI:10.1126/science.337.6096.784-a]. See also WSJ article