PyDamage: automated ancient damage identification and estimation for contigs in ancient DNA de novo assembly

Times cited: 23
Authors
Borry, Maxime [1]
Huebner, Alexander [1,2]
Rohrlach, Adam B. [3,4]
Warinner, Christina [1,2,5]
Affiliations
[1] Max Planck Inst Sci Human Hist, Dept Archaeogenet, Microbiome Sci Grp, Jena, Germany
[2] Friedrich Schiller Univ Jena, Fac Biol Sci, Jena, Germany
[3] Max Planck Inst Sci Human Hist, Dept Archaeogenet, Populat Genet Grp, Jena, Germany
[4] Univ Adelaide, ARC Ctr Excellence Math & Stat Frontiers, Adelaide, SA, Australia
[5] Harvard Univ, Dept Anthropol, Cambridge, MA 02138 USA
Source
PeerJ | 2021 / Volume 9
Funding
European Research Council
Keywords
metagenomics; aDNA; ancient DNA; assembly; damage; de novo; automated; READ ALIGNMENT; RESISTANCE; SEQUENCES; EVOLUTION; PATTERNS;
DOI
10.7717/peerj.11845
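Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code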
07 ; 0710 ; 09 ;
Abstract
DNA de novo assembly can be used to reconstruct longer stretches of DNA (contigs), including genes and even genomes, from short DNA sequencing reads. Applying this technique to metagenomic data derived from archaeological remains, such as paleofeces and dental calculus, we can investigate past microbiome functional diversity that may be absent or underrepresented in the modern microbiome gene catalogue. However, compared to modern samples, ancient samples are often burdened with environmental contamination, resulting in metagenomic datasets that represent mixtures of ancient and modern DNA. The ability to rapidly and reliably establish the authenticity and integrity of ancient samples is essential for ancient DNA studies, and the ability to distinguish between ancient and modern sequences is particularly important for ancient microbiome studies. Characteristic patterns of ancient DNA damage, namely DNA fragmentation and cytosine deamination (observed as C-to-T transitions), are typically used to authenticate ancient samples and sequences, but existing tools for inspecting and filtering aDNA damage either compute it at the read level, which leads to high data loss and lower quality when used in combination with de novo assembly, or require manual inspection, which is impractical for ancient assemblies that typically contain tens to hundreds of thousands of contigs. To address these challenges, we designed PyDamage, a robust, automated approach for aDNA damage estimation and authentication of de novo assembled aDNA. PyDamage uses a likelihood ratio based approach to discriminate between truly ancient contigs and contigs originating from modern contamination. We test PyDamage on both simulated aDNA data and archaeological paleofeces, and we demonstrate its ability to reliably and automatically identify contigs bearing DNA damage characteristic of aDNA.
Coupled with aDNA de novo assembly, PyDamage opens up new doors to explore functional diversity in ancient metagenomic datasets.
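To illustrate what a likelihood ratio based test for damage on a single contig could look like, here is a minimal sketch in Python. It is not PyDamage's implementation. The exponential-decay damage model, the parameter grid, the function names, and the chi-square approximation with 2 degrees of freedom are all assumptions made for this example. The inputs are, for each position counted from the 5' end of the aligned reads, the number of observed C-to-T substitutions and the number of reference cytosines covered.

```python
import math


def binom_loglik(ct, c, p):
    """Binomial log-likelihood (constant term dropped) of ct C-to-T out of c cytosines."""
    p = min(max(p, 1e-12), 1 - 1e-12)  # keep log() finite at the boundaries
    return ct * math.log(p) + (c - ct) * math.log(1 - p)


def null_loglik(ct_counts, c_counts):
    """Null model (assumed): one flat C-to-T rate at every read position."""
    p0 = sum(ct_counts) / sum(c_counts)
    return sum(binom_loglik(ct, c, p0) for ct, c in zip(ct_counts, c_counts))


def damage_loglik(ct_counts, c_counts):
    """Damage model (assumed): rate_i = baseline + amplitude * decay**i.

    The three parameters are fitted by a coarse grid search, chosen here
    only to keep the sketch dependency-free.
    """
    baselines = [i * 0.01 for i in range(21)]   # 0.00 .. 0.20
    amplitudes = [i * 0.05 for i in range(17)]  # 0.00 .. 0.80
    decays = [i * 0.05 for i in range(20)]      # 0.00 .. 0.95
    best = -math.inf
    for b in baselines:
        for a in amplitudes:
            for d in decays:
                ll = sum(
                    binom_loglik(ct, c, b + a * d ** i)
                    for i, (ct, c) in enumerate(zip(ct_counts, c_counts))
                )
                best = max(best, ll)
    return best


def lrt_pvalue(ct_counts, c_counts):
    """Likelihood ratio test of the damage model against the flat null model."""
    stat = 2 * (damage_loglik(ct_counts, c_counts) - null_loglik(ct_counts, c_counts))
    stat = max(stat, 0.0)  # the grid may miss the null optimum by a small margin
    # Survival function of a chi-square distribution with 2 degrees of freedom
    # (3 damage-model parameters minus 1 null parameter, an assumption here).
    return math.exp(-stat / 2)
```

A contig whose C-to-T frequency is elevated at the first positions and decays toward a low baseline yields a very small p-value, while a contig with a flat C-to-T frequency yields a p-value close to 1. When many contigs are tested at once, the resulting p-values would normally need a multiple-testing correction before filtering.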
Pages: 22