HaploMerger2: rebuilding both haploid sub-assemblies from high-heterozygosity diploid genome assembly

Cited by: 129
Authors
Huang, Shengfeng [1]
Kang, Mingjing [1]
Xu, Anlong [1]
Affiliations
[1] Sun Yat-sen University, School of Life Sciences, Guangdong Key Laboratory of Pharmaceutical Functional Genes, State Key Laboratory of Biocontrol, Guangzhou 510275, Guangdong, People's Republic of China
Keywords
DOI
10.1093/bioinformatics/btx220
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010; 081704;
Abstract
De novo assembly is a difficult issue for heterozygous diploid genomes. The advent of high-throughput short-read and long-read sequencing technologies provides both new challenges and potential solutions to the issue. Here, we present HaploMerger2 (HM2), an automated pipeline for rebuilding both haploid sub-assemblies from the polymorphic diploid genome assembly. It is designed to work on pre-existing diploid assemblies, which are typically created by using de novo assemblers. HM2 can process any diploid assemblies, but it is especially suitable for diploid assemblies with high heterozygosity (>3%), which can be difficult for other tools. This pipeline also implements flexible and sensitive assembly error detection, a hierarchical scaffolding procedure and a reliable gap-closing method for haploid sub-assemblies. Using HM2, we demonstrate that two haploid sub-assemblies reconstructed from a real, highly-polymorphic diploid assembly show greatly improved continuity.
Availability and Implementation: Source code, executables and the testing dataset are freely available at https://github.com/mapleforest/HaploMerger2/releases/.
Pages: 2577-2579
Page count: 3
Related papers
19 in total
  • [1] Assembling large genomes with single-molecule sequencing and locality-sensitive hashing
    Berlin, Konstantin
    Koren, Sergey
    Chin, Chen-Shan
    Drake, James P.
    Landolin, Jane M.
    Phillippy, Adam M.
    [J]. NATURE BIOTECHNOLOGY, 2015, 33 (06) : 623 - +
  • [2] Scaffolding pre-assembled contigs using SSPACE
    Boetzer, Marten
    Henkel, Christiaan V.
    Jansen, Hans J.
    Butler, Derek
    Pirovano, Walter
    [J]. BIOINFORMATICS, 2011, 27 (04) : 578 - 579
  • [3] Chen Nansheng, 2004, Curr Protoc Bioinformatics, VChapter 4, DOI 10.1002/0471250953.bi0410s05
  • [4] Chin CS, 2016, NAT METHODS, V13, P1050, DOI 10.1038/nmeth.4035
  • [5] Chin CS, 2013, NAT METHODS, V10, P563, DOI 10.1038/nmeth.2474
  • [6] High-quality draft assemblies of mammalian genomes from massively parallel sequence data
    Gnerre, Sante
    MacCallum, Iain
    Przybylski, Dariusz
    Ribeiro, Filipe J.
    Burton, Joshua N.
    Walker, Bruce J.
    Sharpe, Ted
    Hall, Giles
    Shea, Terrance P.
    Sykes, Sean
    Berlin, Aaron M.
    Aird, Daniel
    Costello, Maura
    Daza, Riza
    Williams, Louise
    Nicol, Robert
    Gnirke, Andreas
    Nusbaum, Chad
    Lander, Eric S.
    Jaffe, David B.
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2011, 108 (04) : 1513 - 1518
  • [7] Decelerated genome evolution in modern vertebrates revealed by analysis of multiple lancelet genomes
    Huang, Shengfeng
    Chen, Zelin
    Yan, Xinyu
    Yu, Ting
    Huang, Guangrui
    Yan, Qingyu
    Pontarotti, Pierre Antoine
    Zhao, Hongchen
    Li, Jie
    Yang, Ping
    Wang, Ruihua
    Li, Rui
    Tao, Xin
    Deng, Ting
    Wang, Yiquan
    Li, Guang
    Zhang, Qiujin
    Zhou, Sisi
    You, Leiming
    Yuan, Shaochun
    Fu, Yonggui
    Wu, Fenfang
    Dong, Meiling
    Chen, Shangwu
    Xu, Anlong
    [J]. NATURE COMMUNICATIONS, 2014, 5 : 5896
  • [8] HaploMerger: Reconstructing allelic relationships for polymorphic diploid genome assemblies
    Huang, Shengfeng
    Chen, Zelin
    Huang, Guangrui
    Yu, Ting
    Yang, Ping
    Li, Jie
    Fu, Yonggui
    Yuan, Shaochun
    Chen, Shangwu
    Xu, Anlong
    [J]. GENOME RESEARCH, 2012, 22 (08) : 1581 - 1588
  • [9] Efficient de novo assembly of highly heterozygous genomes from whole-genome shotgun short reads
    Kajitani, Rei
    Toshimoto, Kouta
    Noguchi, Hideki
    Toyoda, Atsushi
    Ogura, Yoshitoshi
    Okuno, Miki
    Yabana, Mitsuru
    Harada, Masayuki
    Nagayasu, Eiji
    Maruyama, Haruhiko
    Kohara, Yuji
    Fujiyama, Asao
    Hayashi, Tetsuya
    Itoh, Takehiko
    [J]. GENOME RESEARCH, 2014, 24 (08) : 1384 - 1395
  • [10] Koren S., 2017, BIORXIV