Probing the physical limits of reliable DNA data retrieval

Cited by: 90
Authors
Organick, Lee [1]
Chen, Yuan-Jyue [2]
Ang, Siena Dumas [2]
Lopez, Randolph [3]
Liu, Xiaomeng [1]
Strauss, Karin [2]
Ceze, Luis [1]
Affiliations
[1] Univ Washington, Paul G Allen Sch Comp Sci & Engn, Seattle, WA 98195 USA
[2] Microsoft, Redmond, WA 98052 USA
[3] Univ Washington, Dept Bioengn, Seattle, WA 98195 USA
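Keywords
DIGITAL INFORMATION; STORAGE; ROBUST
DOI
10.1038/s41467-020-14319-8
Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code
07; 0710; 09
Abstract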
Synthetic DNA is gaining momentum as a potential medium for archival data storage. In this process, digital information is translated into sequences of nucleotides and the resulting synthetic DNA strands are then stored for later retrieval. Here, we demonstrate reliable file recovery with PCR-based random access when as few as ten copies per sequence are stored, on average. This results in a density of about 17 exabytes/gram, nearly two orders of magnitude greater than prior work has shown. We successfully retrieve the same data in a complex pool of over 10^10 unique sequences per microliter with no evidence that we have begun to approach complexity limits. Finally, we also investigate the effects of file size and sequencing coverage on successful file retrieval and look for systematic DNA strand dropout. These findings substantiate the robustness and high data density of the process examined here. The physical limits and reliability of PCR-based random access of DNA-encoded data are unknown. Here the authors demonstrate reliable file recovery from as few as ten copies per sequence, providing a data density limit of 17 exabytes per gram.
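The link between copies per sequence and density in the abstract can be illustrated with a rough back-of-the-envelope sketch. The function below is hypothetical and is not the authors' calculation: it assumes single-stranded DNA at roughly 330 g/mol per nucleotide (a common approximation), and the strand length and payload size in the usage example are illustrative values, not figures taken from the paper.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI value)


def density_bytes_per_gram(payload_bytes_per_strand: float,
                           strand_length_nt: int,
                           copies_per_sequence: float,
                           nt_mass_g_per_mol: float = 330.0) -> float:
    """Estimate stored bytes per gram of DNA.

    Assumptions (not from the source): single-stranded DNA, an average
    nucleotide mass of about 330 g/mol, and no mass from buffer or carrier.
    """
    if min(payload_bytes_per_strand, strand_length_nt,
           copies_per_sequence, nt_mass_g_per_mol) <= 0:
        raise ValueError("all parameters must be positive")
    # Mass of one physical strand in grams.
    grams_per_strand = strand_length_nt * nt_mass_g_per_mol / AVOGADRO
    # Every unique sequence is stored as several physical copies.
    grams_per_sequence = grams_per_strand * copies_per_sequence
    return payload_bytes_per_strand / grams_per_sequence


# Illustrative inputs only: 150-nt strands carrying 15 payload bytes each.
estimate = density_bytes_per_gram(15, 150, 10)
print(f"{estimate / 1e18:.1f} exabytes per gram")
```

With these assumed inputs the estimate lands on the order of 10^19 bytes per gram, the same order of magnitude as the figure quoted in the abstract. Density scales inversely with the number of copies per sequence, which is why lowering the copy number needed for reliable retrieval raises the achievable density.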
Pages: 7
Related papers
19 entries in total
[1]   Forward Error Correction for DNA Data Storage [J].
Blawat, Meinolf ;
Gaedke, Klaus ;
Huetter, Ingo ;
Chen, Xiao-Ming ;
Turczyk, Brian ;
Inverso, Samuel ;
Pruitt, Benjamin W. ;
Church, George M. .
INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE 2016 (ICCS 2016), 2016, 80 :1011-1022
[2]   Bornholt J, 2016, P ASPLOS
[3]   Chen Y.-J, 2019, PREPRINT, DOI 10.1101/566554v1
[4]   Next-Generation Digital Information Storage in DNA [J].
Church, George M. ;
Gao, Yuan ;
Kosuri, Sriram .
SCIENCE, 2012, 337 (6102) :1628-1628
[5]   DNA Fountain enables a robust and efficient storage architecture [J].
Erlich, Yaniv ;
Zielinski, Dina .
SCIENCE, 2017, 355 (6328) :950-953
[6]   The NCBI BioSystems database [J].
Geer, Lewis Y. ;
Marchler-Bauer, Aron ;
Geer, Renata C. ;
Han, Lianyi ;
He, Jane ;
He, Siqian ;
Liu, Chunlei ;
Shi, Wenyao ;
Bryant, Stephen H. .
NUCLEIC ACIDS RESEARCH, 2010, 38 :D492-D496
[7]   Towards practical, high-capacity, low-maintenance information storage in synthesized DNA [J].
Goldman, Nick ;
Bertone, Paul ;
Chen, Siyuan ;
Dessimoz, Christophe ;
LeProust, Emily M. ;
Sipos, Botond ;
Birney, Ewan .
NATURE, 2013, 494 (7435) :77-80
[8]   DrImpute: imputing dropout events in single cell RNA sequencing data [J].
Gong, Wuming ;
Kwak, Il-Youp ;
Pota, Pruthvi ;
Koyano-Nakagawa, Naoko ;
Garry, Daniel J. .
BMC BIOINFORMATICS, 2018, 19
[9]   Robust Chemical Preservation of Digital Information on DNA in Silica with Error-Correcting Codes [J].
Grass, Robert N. ;
Heckel, Reinhard ;
Puddu, Michela ;
Paunescu, Daniela ;
Stark, Wendelin J. .
ANGEWANDTE CHEMIE-INTERNATIONAL EDITION, 2015, 54 (08) :2552-2555
[10]   Kharchenko PV, 2014, NAT METHODS, V11, P740, DOI 10.1038/nmeth.2967