HaploMerger2: rebuilding both haploid sub-assemblies from high-heterozygosity diploid genome assembly

Cited by: 135
Authors
Huang, Shengfeng [1]
Kang, Mingjing [1]
Xu, Anlong [1]
Affiliations
[1] Sun Yat Sen Univ, Sch Life Sci, Guangdong Key Lab Pharmaceut Funct Genes, State Key Lab Biocontrol, Guangzhou 510275, Guangdong, Peoples R China
DOI
10.1093/bioinformatics/btx220
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
De novo assembly is a difficult issue for heterozygous diploid genomes. The advent of high-throughput short-read and long-read sequencing technologies provides both new challenges and potential solutions to the issue. Here, we present HaploMerger2 (HM2), an automated pipeline for rebuilding both haploid sub-assemblies from the polymorphic diploid genome assembly. It is designed to work on pre-existing diploid assemblies, which are typically created by using de novo assemblers. HM2 can process any diploid assemblies, but it is especially suitable for diploid assemblies with high heterozygosity (>3%), which can be difficult for other tools. This pipeline also implements flexible and sensitive assembly error detection, a hierarchical scaffolding procedure and a reliable gap-closing method for haploid sub-assemblies. Using HM2, we demonstrate that two haploid sub-assemblies reconstructed from a real, highly-polymorphic diploid assembly show greatly improved continuity.
Availability and Implementation: Source code, executables and the testing dataset are freely available at https://github.com/mapleforest/HaploMerger2/releases/.
Pages: 2577-2579
Page count: 3
Related papers
19 in total
[1]   Assembling large genomes with single-molecule sequencing and locality-sensitive hashing [J].
Berlin, Konstantin ;
Koren, Sergey ;
Chin, Chen-Shan ;
Drake, James P. ;
Landolin, Jane M. ;
Phillippy, Adam M. .
NATURE BIOTECHNOLOGY, 2015, 33 (06) :623-+
[2]   Scaffolding pre-assembled contigs using SSPACE [J].
Boetzer, Marten ;
Henkel, Christiaan V. ;
Jansen, Hans J. ;
Butler, Derek ;
Pirovano, Walter .
BIOINFORMATICS, 2011, 27 (04) :578-579
[3]   Chen Nansheng, 2004, Curr Protoc Bioinformatics, Chapter 4, DOI 10.1002/0471250953.bi0410s05
[4]   Chin CS, 2016, NAT METHODS, V13, P1050, DOI 10.1038/nmeth.4035
[5]   Chin CS, 2013, NAT METHODS, V10, P563, DOI 10.1038/nmeth.2474
[6]   High-quality draft assemblies of mammalian genomes from massively parallel sequence data [J].
Gnerre, Sante ;
MacCallum, Iain ;
Przybylski, Dariusz ;
Ribeiro, Filipe J. ;
Burton, Joshua N. ;
Walker, Bruce J. ;
Sharpe, Ted ;
Hall, Giles ;
Shea, Terrance P. ;
Sykes, Sean ;
Berlin, Aaron M. ;
Aird, Daniel ;
Costello, Maura ;
Daza, Riza ;
Williams, Louise ;
Nicol, Robert ;
Gnirke, Andreas ;
Nusbaum, Chad ;
Lander, Eric S. ;
Jaffe, David B. .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2011, 108 (04) :1513-1518
[7]   Decelerated genome evolution in modern vertebrates revealed by analysis of multiple lancelet genomes [J].
Huang, Shengfeng ;
Chen, Zelin ;
Yan, Xinyu ;
Yu, Ting ;
Huang, Guangrui ;
Yan, Qingyu ;
Pontarotti, Pierre Antoine ;
Zhao, Hongchen ;
Li, Jie ;
Yang, Ping ;
Wang, Ruihua ;
Li, Rui ;
Tao, Xin ;
Deng, Ting ;
Wang, Yiquan ;
Li, Guang ;
Zhang, Qiujin ;
Zhou, Sisi ;
You, Leiming ;
Yuan, Shaochun ;
Fu, Yonggui ;
Wu, Fenfang ;
Dong, Meiling ;
Chen, Shangwu ;
Xu, Anlong .
NATURE COMMUNICATIONS, 2014, 5 :5896
[8]   HaploMerger: Reconstructing allelic relationships for polymorphic diploid genome assemblies [J].
Huang, Shengfeng ;
Chen, Zelin ;
Huang, Guangrui ;
Yu, Ting ;
Yang, Ping ;
Li, Jie ;
Fu, Yonggui ;
Yuan, Shaochun ;
Chen, Shangwu ;
Xu, Anlong .
GENOME RESEARCH, 2012, 22 (08) :1581-1588
[9]   Efficient de novo assembly of highly heterozygous genomes from whole-genome shotgun short reads [J].
Kajitani, Rei ;
Toshimoto, Kouta ;
Noguchi, Hideki ;
Toyoda, Atsushi ;
Ogura, Yoshitoshi ;
Okuno, Miki ;
Yabana, Mitsuru ;
Harada, Masayuki ;
Nagayasu, Eiji ;
Maruyama, Haruhiko ;
Kohara, Yuji ;
Fujiyama, Asao ;
Hayashi, Tetsuya ;
Itoh, Takehiko .
GENOME RESEARCH, 2014, 24 (08) :1384-1395
[10]   Koren S., 2017, BIORXIV