Phylogenomics from low-coverage whole-genome sequencing

Times cited: 107
Authors
Zhang, Feng [1 ,2 ,3 ]
Ding, Yinhuan [1 ]
Zhu, Chao-Dong [2 ,4 ]
Zhou, Xin [5 ]
Orr, Michael C. [2 ]
Scheu, Stefan [3 ]
Luan, Yun-Xia [6 ]
Affiliations
[1] Nanjing Agr Univ, Coll Plant Protect, Dept Entomol, Nanjing, Jiangsu, Peoples R China
[2] Chinese Acad Sci, Inst Zool, Key Lab Zool Systemat & Evolut, Beijing, Peoples R China
[3] Univ Gottingen, JF Blumenbach Inst Zool & Anthropol, Gottingen, Germany
[4] Univ Chinese Acad Sci, Coll Life Sci, Beijing, Peoples R China
[5] China Agr Univ, Dept Entomol, Beijing, Peoples R China
[6] South China Normal Univ, Sch Life Sci, Inst Insect Sci & Technol, Guangdong Prov Key Lab Insect Dev Biol & Appl Tec, Guangzhou, Guangdong, Peoples R China
Source
METHODS IN ECOLOGY AND EVOLUTION | 2019 / Volume 10 / Issue 04
Funding
National Natural Science Foundation of China;
Keywords
desktop PC; genome assembly; hybrid enrichment; single-copy orthologs; ultraconserved elements; GENE PREDICTION; EVOLUTIONARY; ALIGNMENT; AMPLIFICATION; PERFORMANCE; ENRICHMENT; PLANT; TREE; TOOL;
DOI
10.1111/2041-210X.13145
Chinese Library Classification
Q14 [Ecology (bioecology)];
Subject classification code
071012 ; 0713 ;
Abstract
Phylogenetic studies increasingly rely on next-generation sequencing. Transcriptome sequencing and hybrid enrichment remain the most prevalent methods for collecting phylogenomic data because they demand less computing power and lower sequencing costs than whole-genome sequencing (WGS). However, the transcriptome-based method is constrained by the availability of fresh material, and hybrid enrichment is limited by the genomic resources needed for probe design, especially for non-model organisms. We present a novel WGS-based pipeline that extracts essential phylogenomic markers through rapid de novo genome assembly from low-coverage genome data, employing a series of computationally efficient bioinformatic tools. We tested the pipeline on a Hexapoda dataset and a more focused Phthiraptera dataset (genome sizes 0.1-2 Gbp), and further investigated the effect of sequencing depth on target assembly success rate using the raw data of six insect genomes (0.1-1 Gbp). Each genome assembly was completed in 2-24 hr on desktop PCs. We extracted 872-1,615 near-universal single-copy orthologs (Benchmarking Universal Single-Copy Orthologs [BUSCOs]) per species. The method also enables the development of ultraconserved element (UCE) probe sets; we generated probes for Phthiraptera based on our WGS assemblies, containing 55,030 baits targeting 2,832 loci, from which we extracted 2,125-2,272 UCEs. The resulting phylogenetic trees all agreed with the currently accepted topologies, indicating that the markers produced by our method are valid for phylogenomic studies. We also showed that 10-20x sequencing coverage was sufficient to produce hundreds to thousands of targeted loci from BUSCO sets, and that even lower coverage (5x) sufficed for UCEs. Our study demonstrates the feasibility of conducting phylogenomics from low-coverage WGS for a wide range of organisms without reference genomes. This new approach has major advantages in data collection, particularly in reducing sequencing cost and computing consumption, while expanding the choice of loci.
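The coverage figures in the abstract can be turned into a rough sequencing budget. The sketch below is illustrative only and is not part of the published pipeline: it uses the standard definition of coverage (total sequenced bases divided by genome size), and the function names and the 150 bp paired-end read length are assumptions chosen for the example.

```python
import math


def required_bases(genome_size_bp: int, coverage: float) -> float:
    """Total bases to sequence so that mean depth equals `coverage`.

    Uses the standard definition: coverage = total bases / genome size.
    """
    return genome_size_bp * coverage


def read_pairs_needed(genome_size_bp: int, coverage: float,
                      read_length: int = 150) -> int:
    """Number of paired-end read pairs needed for the target coverage.

    read_length=150 is an assumed value for illustration, not a figure
    taken from the abstract. Each pair contributes 2 * read_length bases.
    """
    total = required_bases(genome_size_bp, coverage)
    return math.ceil(total / (2 * read_length))


if __name__ == "__main__":
    # Genome sizes span the 0.1-2 Gbp range mentioned in the abstract;
    # coverages are the 5x (UCE) and 10-20x (BUSCO) levels it reports.
    for genome_size in (100_000_000, 1_000_000_000, 2_000_000_000):
        for cov in (5, 10, 20):
            gbp = required_bases(genome_size, cov) / 1e9
            pairs = read_pairs_needed(genome_size, cov)
            print(f"{genome_size / 1e9:.1f} Gbp genome at {cov}x: "
                  f"{gbp:.1f} Gbp of data, {pairs:,} read pairs")
```

Under these assumptions, a 1 Gbp genome at 10x needs 10 Gbp of raw data. Real projects would typically budget above this figure to allow for reads lost to quality trimming and contamination.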
Pages: 507-517
Page count: 11