A clustering method for next-generation sequences of bacterial genomes through multiomics data mapping

Cited by: 0
Authors
Ho-Sik Seok
Mikang Sim
Daehwan Lee
Jaebum Kim
Affiliations
[1] Konkuk University, Department of Animal Biotechnology
[2] Konkuk University, Department of Animal Biotechnology, UBITA Center for Biotechnology Research (CBRU)
Source
Genes & Genomics | 2014 / Vol. 36
Keywords
Next-generation sequence assembly; Multiomics data; Read clustering
DOI
Not available
Abstract
The recent availability of various ‘omics’ data presents new challenges and opportunities for research on the assembly of next-generation sequences. To exploit these opportunities, we developed a next-generation sequence clustering method that focuses on the interdependency between genomics and proteomics data. Assuming that both next-generation read sequences and proteomics data are available for a target species, we mapped the read sequences against protein sequences and identified physically adjacent reads using a machine learning-based read assignment method. We measured the performance of our method using simulated read sequences and collected protein sequences of Escherichia coli (E. coli). Focusing on the actual adjacency of the clustered reads in the E. coli genome, we found that (i) the proposed method improves the performance of read clustering and (ii) the use of proteomics data has the potential to enhance the performance of genome assemblers. These results demonstrate that the integrative approach is effective for accurately grouping adjacent reads in a genome, which should lead to better genome assembly.
Pages: 191-196
Page count: 5