A clustering method for next-generation sequences of bacterial genomes through multiomics data mapping

Cited by: 0

Authors:

Ho-Sik Seok

Mikang Sim

Daehwan Lee

Jaebum Kim

Affiliations:

[1] Konkuk University, Department of Animal Biotechnology

[2] Konkuk University, Department of Animal Biotechnology, UBITA Center for Biotechnology Research (CBRU)

Source:

Genes & Genomics | 2014 / Vol. 36, No. 2

Keywords:

Next-generation sequence assembly; Multiomics data; Read clustering

DOI:

Not available

Abstract:

With various ‘omics’ data recently becoming available, new challenges and opportunities arise for research on the assembly of next-generation sequences. To exploit these opportunities, we developed a next-generation sequence clustering method that focuses on the interdependency between genomics and proteomics data. Assuming that both next-generation read sequences and proteomics data are available for a target species, we mapped the read sequences against protein sequences and identified physically adjacent reads using a machine learning-based read assignment method. We measured the performance of our method using simulated read sequences and collected protein sequences of Escherichia coli (E. coli). Focusing on the actual adjacency of the clustered reads in the E. coli genome, we found that (i) the proposed method improves the performance of read clustering and (ii) the use of proteomics data has the potential to enhance the performance of genome assemblers. These results demonstrate that the integrative approach is effective for accurately grouping adjacent reads in a genome, which will result in a better genome assembly.
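The core idea in the abstract, grouping reads by the protein they map to, can be illustrated with a toy sketch. This is not the authors' method: the machine learning-based read assignment step is replaced here by a plain exact-substring match on six-frame translations, and all names (`six_frame_translations`, `cluster_reads_by_protein`, `min_len`) are hypothetical. It only shows why reads that hit the same protein are candidates for being physically adjacent in the genome.

```python
from collections import defaultdict
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def six_frame_translations(read):
    """Translate a DNA read in three forward and three reverse-complement frames."""
    strands = (read, read.translate(COMPLEMENT)[::-1])
    peptides = []
    for strand in strands:
        for offset in range(3):
            codons = (strand[i:i + 3] for i in range(offset, len(strand) - 2, 3))
            # Codons with ambiguous bases are translated to 'X'.
            peptides.append("".join(CODON_TABLE.get(c, "X") for c in codons))
    return peptides


def cluster_reads_by_protein(reads, proteins, min_len=5):
    """Group read IDs by the protein that one of their translations occurs in.

    Assumption for illustration only: a read "maps" to a protein when a
    translated frame of at least min_len residues is an exact substring
    of the protein sequence.
    """
    clusters = defaultdict(list)
    for read_id, seq in reads.items():
        frames = six_frame_translations(seq)
        for prot_id, prot_seq in proteins.items():
            if any(len(p) >= min_len and p in prot_seq for p in frames):
                clusters[prot_id].append(read_id)
    return dict(clusters)
```

Given a dictionary of read sequences and a dictionary of protein sequences, `cluster_reads_by_protein(reads, proteins)` returns a mapping from protein ID to the IDs of reads assigned to it. A realistic pipeline would use an alignment tool that tolerates mismatches and a scoring model for ambiguous hits, in place of exact matching.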

Pages: 191-196

Page count: 5

