GenomeScope 2.0 and Smudgeplot for reference-free profiling of polyploid genomes

Cited by: 1034
Authors
Ranallo-Benavidez, T. Rhyker [1]
Jaron, Kamil S. [2, 3]
Schatz, Michael C. [1, 4]
Affiliations
[1] Johns Hopkins Univ, Baltimore, MD 21218 USA
[2] Univ Lausanne, Lausanne, Switzerland
[3] Swiss Inst Bioinformat, Lausanne, Switzerland
[4] Cold Spring Harbor Lab, New York, NY USA
Funding
Swiss National Science Foundation
Keywords
ABUNDANCE; UNCOVERS; SEQUENCE; QUALITY;
DOI
10.1038/s41467-020-14998-3
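Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification codes
07; 0710; 09;
Abstract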
An important assessment prior to genome assembly and related analyses is genome profiling, where the k-mer frequencies within raw sequencing reads are analyzed to estimate major genome characteristics such as size, heterozygosity, and repetitiveness. Here we introduce GenomeScope 2.0 (https://github.com/tbenavi1/genomescope2.0), which applies combinatorial theory to establish a detailed mathematical model of how k-mer frequencies are distributed in heterozygous and polyploid genomes. We describe and evaluate a practical implementation of the polyploid-aware mixture model that quickly and accurately infers genome properties across thousands of simulated and several real datasets spanning a broad range of complexity. We also present a method called Smudgeplot (https://github.com/KamilSJaron/smudgeplot) to visualize and estimate the ploidy and genome structure of a genome by analyzing heterozygous k-mer pairs. We successfully apply the approach to systems of known variable ploidy levels in the Meloidogyne genus and the extreme case of octoploid Fragaria × ananassa.
Prior to genome assembly, the raw sequencing reads must be analyzed for assessment of major genome characteristics such as genome size, heterozygosity, and repetitiveness. For this purpose, the authors introduce GenomeScope 2.0, an extension of GenomeScope for polyploid genomes, and Smudgeplot, which can estimate a genome's ploidy.
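The k-mer profiling step that the abstract refers to can be illustrated with a minimal Python sketch. It is not the GenomeScope 2.0 or Smudgeplot implementation: the function names (`count_kmers`, `kmer_spectrum`, `pair_statistics`) are hypothetical, the counting is a naive in-memory approach suitable only for toy inputs, and the two pair statistics (summed coverage and minor-coverage fraction) are an assumption about the kind of quantities one would compute for a heterozygous k-mer pair.

```python
from collections import Counter

# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def count_kmers(reads, k):
    """Count canonical k-mers across all reads, skipping k-mers with non-ACGT bases."""
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if set(kmer) <= set("ACGT"):
                counts[canonical(kmer)] += 1
    return counts


def kmer_spectrum(counts):
    """Map each coverage value to the number of distinct k-mers seen that many times."""
    return Counter(counts.values())


def pair_statistics(cov_a, cov_b):
    """For a pair of k-mer coverages, return (summed coverage, minor-coverage fraction)."""
    total = cov_a + cov_b
    return total, min(cov_a, cov_b) / total


if __name__ == "__main__":
    toy_reads = ["ACGTA", "ACGTA"]
    counts = count_kmers(toy_reads, k=3)
    print(dict(counts))
    print(dict(kmer_spectrum(counts)))
    print(pair_statistics(30, 10))
```

The histogram returned by `kmer_spectrum` is the kind of input a mixture model of k-mer frequencies would be fitted to. For real sequencing data, a dedicated k-mer counter would be used in place of this in-memory loop.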
Pages: 10