Assembly-free quantification of vagrant DNA inserts

Times cited: 0
Authors
Becher, Hannes [1]
Nichols, Richard A. [2]
Affiliations
[1] Univ Edinburgh, Inst Genet & Canc, Edinburgh, Scotland
[2] Queen Mary Univ London, Sch Biol & Behav Sci, London, England
Keywords
endosymbionts; genome skimming; nuclear pseudogenes; NUMTs; NUPTs; quantification; GRASSHOPPER PODISMA-PEDESTRIS; SEX-CHROMOSOME POLYMORPHISM; MITOCHONDRIAL-DNA; GENOME; NUMTS; PSEUDOGENES; RACES; CHLOROPLAST; SEQUENCES; ALIGNMENT;
DOI
10.1111/1755-0998.13764
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular biology]
Discipline classification code
071010; 081704
Abstract
Inserts of DNA from extranuclear sources, such as organelles and microbes, are common in eukaryote nuclear genomes. However, sequence similarity between the nuclear and extranuclear DNA, and a history of multiple insertions, make the assembly of these regions challenging. Consequently, the number, sequence and location of these vagrant DNAs cannot be reliably inferred from the genome assemblies of most organisms. We introduce two statistical methods to estimate the abundance of nuclear inserts even in the absence of a nuclear genome assembly. The first (intercept method) only requires low-coverage (<1x) sequencing data, as commonly generated for population studies of organellar and ribosomal DNAs. The second method additionally requires that a subset of the individuals carry extranuclear DNA with diverged genotypes. We validated our intercept method using simulations and by re-estimating the frequency of human NUMTs (nuclear mitochondrial inserts). We then applied it to the grasshopper Podisma pedestris, exceptional for both its large genome size and reports of numerous NUMT inserts, estimating that NUMTs make up 0.056% of the nuclear genome, equivalent to >500 times the mitochondrial genome size. We also re-analysed a museomics data set of the parrot Psephotellus varius, obtaining an estimate of only 0.0043%, in line with reports from other species of bird. Our study demonstrates the utility of low-coverage high-throughput sequencing data for the quantification of nuclear vagrant DNAs. Beyond quantifying organellar inserts, these methods could also be used on endosymbiont-derived sequences. We provide an R implementation of our methods called "vagrantDNA" and code to simulate test data sets.
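The abstract reports NUMT abundance both as a fraction of the nuclear genome (0.056%) and as multiples of the mitochondrial genome size (>500). The conversion between the two can be sketched as follows. This is only an illustration of the arithmetic, not part of the authors' "vagrantDNA" package: the function name `insert_equivalents` is hypothetical, and the nuclear and mitochondrial genome sizes used below are round placeholder values chosen for the example, not figures taken from the abstract.

```python
def insert_equivalents(insert_fraction, nuclear_genome_bp, organelle_genome_bp):
    """Convert an insert fraction into total insert length and organelle-genome equivalents.

    insert_fraction     -- proportion of the nuclear genome made up of inserts (0..1)
    nuclear_genome_bp   -- nuclear genome size in base pairs
    organelle_genome_bp -- organelle (e.g. mitochondrial) genome size in base pairs
    """
    if not 0 <= insert_fraction <= 1:
        raise ValueError("insert_fraction must be between 0 and 1")
    if nuclear_genome_bp <= 0 or organelle_genome_bp <= 0:
        raise ValueError("genome sizes must be positive")
    insert_bp = insert_fraction * nuclear_genome_bp
    return insert_bp, insert_bp / organelle_genome_bp


# ASSUMED placeholder sizes for illustration only (not from the abstract):
NUCLEAR_BP = 16_000_000_000  # hypothetical large nuclear genome
MITO_BP = 16_000             # hypothetical mitochondrial genome

# 0.056% is the NUMT fraction reported in the abstract for Podisma pedestris.
insert_bp, copies = insert_equivalents(0.056 / 100, NUCLEAR_BP, MITO_BP)
print(f"insert length: {insert_bp:,.0f} bp; mitochondrial genome equivalents: {copies:,.0f}")
```

With these placeholder sizes the result is several hundred mitochondrial genome equivalents, which matches the order of magnitude stated in the abstract; the exact figure depends on the real genome sizes.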
Pages: 1002-1013
Page count: 12