A Characterization of the DNA Data Storage Channel

Cited by: 204
Authors
Heckel, Reinhard [1 ]
Mikutis, Gediminas [2 ]
Grass, Robert N. [2 ]
Affiliations
[1] Rice Univ, Dept Elect & Comp Engn, Houston, TX 77005 USA
[2] Swiss Fed Inst Technol, Dept Chem & Appl Biosci, CH-8093 Zurich, Switzerland
Keywords
DIGITAL INFORMATION; DEPURINATION; MICROARRAYS; CELL;
DOI
10.1038/s41598-019-45832-6
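Chinese Library Classification (CLC) number
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract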
Owing to its longevity and enormous information density, DNA, the molecule encoding biological information, has emerged as a promising archival storage medium. However, due to technological constraints, data can only be written onto many short DNA molecules that are stored in an unordered way, and can only be read by sampling from this DNA pool. Moreover, imperfections in writing (synthesis), reading (sequencing), storage, and handling of the DNA, in particular amplification via PCR, lead to a loss of DNA molecules and induce errors within the molecules. In order to design DNA storage systems, a qualitative and quantitative understanding of the errors and the loss of molecules is crucial. In this paper, we characterize those error probabilities by analyzing data from our own experiments as well as from experiments of two different groups. We find that errors within molecules are mainly due to synthesis and sequencing, while imperfections in handling and storage lead to a significant loss of sequences. The aim of our study is to help guide the design of future DNA data storage systems by providing a quantitative and qualitative understanding of the DNA data storage channel.
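The channel described in the abstract (many short molecules stored without order, read by random sampling, with molecules lost and errors introduced along the way) can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not the model or the numbers from the paper: the function name `simulate_channel` and the parameters `loss_prob` and `sub_prob` are hypothetical, losses are treated as independent per molecule, and only substitution errors are modeled, whereas real synthesis and sequencing also produce insertions and deletions.

```python
import random

BASES = "ACGT"


def simulate_channel(sequences, n_reads, sub_prob, loss_prob, seed=0):
    """Toy DNA storage channel (illustrative assumptions only).

    sequences: list of written DNA strings (the unordered pool).
    n_reads:   number of reads drawn by sampling with replacement.
    sub_prob:  hypothetical per-base substitution probability.
    loss_prob: hypothetical per-molecule loss probability.
    """
    rng = random.Random(seed)

    # Storage and handling: each molecule is lost independently.
    pool = [s for s in sequences if rng.random() >= loss_prob]
    if not pool:
        return []

    reads = []
    for _ in range(n_reads):
        # Reading: sample a surviving molecule uniformly at random,
        # so order is not preserved and some molecules may never be seen.
        molecule = rng.choice(pool)
        read = []
        for base in molecule:
            if rng.random() < sub_prob:
                # Substitution: replace the base with a different one.
                read.append(rng.choice([b for b in BASES if b != base]))
            else:
                read.append(base)
        reads.append("".join(read))
    return reads


if __name__ == "__main__":
    written = ["ACGTACGT", "TTGACCAG", "GGCATTCA"]
    for r in simulate_channel(written, n_reads=5, sub_prob=0.05, loss_prob=0.2):
        print(r)
```

Even in this simplified form, the sketch shows why a decoder has to cope with both missing sequences and erroneous copies: a written molecule may be absent from the reads entirely, while others appear several times with differing errors.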
Pages: 12