Metagenome assembly through clustering of next-generation sequencing data using protein sequences

Authors
Sim, Mikang [1]
Kim, Jaebum [1]
Affiliations
[1] Konkuk Univ, Dept Anim Biotechnol, Seoul 143701, South Korea
Funding
National Research Foundation, Singapore;
Keywords
Metagenome assembly; Next-generation sequencing; Sequence clustering; Protein sequences; SHORT DNA-SEQUENCES; DE-NOVO ASSEMBLER; GENOME SEQUENCE; SINGLE-CELL; METAPROTEOMICS; ALGORITHM; MILLIONS; VELVET; COLI; IDBA;
DOI
10.1016/j.mimet.2015.01.002
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Metagenomics, the study of environmental microbial communities, has gained considerable attention because of recent advances in next-generation sequencing (NGS) technologies. Microbes play a critical role in changing their environments, and the way they exert this effect can be elucidated by investigating metagenomes. However, the complexity of metagenomes, such as the mixture of multiple microbes and differences in species abundance, makes metagenome assembly challenging. In this paper, we developed a new metagenome assembly method that uses protein sequences in addition to the NGS read sequences. Our method (i) builds read clusters by using mapping information against available protein sequences, and (ii) creates contig sequences by finding consensus sequences through probabilistic choices from the read clusters. Using NGS read sequences simulated from real microbial genome sequences, we evaluated our method in comparison with four existing assembly programs. We found that our method could generate relatively long and accurate metagenome assemblies, indicating that the idea of using protein sequences as a guide for the assembly is promising. (C) 2015 Elsevier B.V. All rights reserved.
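The two steps named in the abstract, (i) clustering reads by the protein they map to and (ii) building a consensus by probabilistic choice within each cluster, can be illustrated with a toy sketch. This is not the authors' implementation: the input tuple layout, the nucleotide `offset` field, the function names, and the rule of sampling each column's base in proportion to its observed frequency are all assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict


def cluster_reads_by_protein(mappings):
    """Group reads by the protein they were mapped to.

    `mappings` is an iterable of (read_id, protein_id, offset, sequence),
    where `offset` is a hypothetical nucleotide start position of the read
    relative to the start of the protein-coding region.
    """
    clusters = defaultdict(list)
    for _read_id, protein_id, offset, seq in mappings:
        clusters[protein_id].append((offset, seq))
    return dict(clusters)


def probabilistic_consensus(cluster, rng):
    """Build one consensus string from a read cluster.

    Each covered position is filled by sampling a base with probability
    proportional to how often the reads show it there (an assumed rule).
    Uncovered positions are skipped, so this toy version assumes the reads
    cover the region without gaps.
    """
    columns = defaultdict(Counter)
    for offset, seq in cluster:
        for i, base in enumerate(seq):
            columns[offset + i][base] += 1

    consensus = []
    for pos in sorted(columns):
        bases, weights = zip(*sorted(columns[pos].items()))
        consensus.append(rng.choices(bases, weights=weights, k=1)[0])
    return "".join(consensus)


# Toy input: three reads map to protA, one to protB (all values invented).
toy_mappings = [
    ("r1", "protA", 0, "ATGGCC"),
    ("r2", "protA", 3, "GCCTTA"),
    ("r3", "protB", 0, "GGGAAA"),
    ("r4", "protA", 3, "GCCTTA"),
]

rng = random.Random(0)  # fixed seed so repeated runs give the same result
clusters = cluster_reads_by_protein(toy_mappings)
contigs = {pid: probabilistic_consensus(c, rng) for pid, c in clusters.items()}
```

In this toy input the overlapping reads agree at every position, so the sampled consensus is the same on every run. Where reads disagree, the sampled base follows the observed frequencies rather than always taking the majority.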
Pages: 180 - 187
Number of pages: 8
Related papers
50 in total
  • [1] Assembly of repetitive regions using next-generation sequencing data
    Nowak, Robert M.
    BIOCYBERNETICS AND BIOMEDICAL ENGINEERING, 2015, 35 (04) : 276 - 283
  • [2] Assembly algorithms for next-generation sequencing data
    Miller, Jason R.
    Koren, Sergey
    Sutton, Granger
    GENOMICS, 2010, 95 (06) : 315 - 327
  • [3] A Clustering Approach for DeNovo Assembly using Next Generation Sequencing Data
    Kchouk, Mehdi
    Elloumi, Mourad
    2016 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM), 2016, : 1909 - 1911
  • [4] Graph-based clustering and characterization of repetitive sequences in next-generation sequencing data
    Novak, Petr
    Neumann, Pavel
    Macas, Jiri
    BMC BIOINFORMATICS, 2010, 11
  • [5] The Genome Assembly Model for Next-Generation Sequencing Data
    Wang, Yirong
    Wei, Chengdong
    Zhang, Xiaodong
    Cen, Tailin
    PROCEEDINGS OF THE 2017 INTERNATIONAL CONFERENCE ON APPLIED MATHEMATICS, MODELLING AND STATISTICS APPLICATION (AMMSA 2017), 2017, 141 : 97 - 101
  • [6] First insights into the metagenome of Egyptian mummies using next-generation sequencing
    Khairat, Rabab
    Ball, Markus
    Chang, Chun-Chi Hsieh
    Bianucci, Raffaella
    Nerlich, Andreas G.
    Trautmann, Martin
    Ismail, Somaia
    Shanab, Gamila M. L.
    Karim, Amr M.
    Gad, Yehia Z.
    Pusch, Carsten M.
    JOURNAL OF APPLIED GENETICS, 2013, 54 (03) : 309 - 325
  • [7] A clustering method for next-generation sequences of bacterial genomes through multiomics data mapping
    Seok, Ho-Sik
    Sim, Mikang
    Lee, Daehwan
    Kim, Jaebum
    GENES & GENOMICS, 2014, 36 (02) : 191 - 196