Intrinsic correlation of oligonucleotides: A novel genomic signature for metagenome analysis

Cited by: 5
Authors
Ding, Xiao [1]
Cao, Chang-Chang [1]
Sun, Xiao [1]
Affiliations
[1] Southeast Univ, Sch Biol Sci & Med Engn, State Key Lab Bioelect, Nanjing, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Genomic signature; Sequence correlation; Microbial species discrimination; CODON USAGE; SELECTION; REPRESENTATION;
DOI
10.1016/j.jtbi.2014.02.039
Chinese Library Classification number
Q [Biological Sciences];
Subject classification codes
07 ; 0710 ; 09 ;
Abstract
Because the vast majority (99%) of microbes in a given community are likely to be non-cultivable, metagenomics has gradually entered the mainstream of microbial research methods. With the development of high-throughput sequencing techniques, an increasing number of metagenomic sequencing read data sets from various microbial communities have become available. For these data sets, metagenomic analysis based on mapping reads to microbial genomes has been hampered by the limited number of available microbial genomes, and this type of analysis is also computationally intensive. Alignment-free methods, which characterize the sequencing reads with a genomic signature instead of with genomic alignments, can be applied instead. The main requirement of these alignment-free methods, however, is a stable genomic signature that performs reliably. Here, we propose a novel genomic signature of microbial genomes called the intrinsic correlation of oligonucleotides (ICOs). This signature quantifies an intrinsic relationship between any two oligonucleotides. We analyzed microbial genomes at different taxonomic levels using ICO profiles and confirmed the wide availability of useful ICOs. We used intra-genomic and inter-genomic distances and relational grades to evaluate the performance of ICOs as a genomic signature. The results of these experiments showed that ICOs characterize microbial genomes well and distinguish species better than tetranucleotide composition does, both for whole genomes and for sequence fragments. In addition, we evaluated the performance of a hybrid feature that combined ICOs and tetranucleotide composition. The experimental results showed that the hybrid feature performed better than either ICOs or tetranucleotide composition alone. ICOs can characterize microbial genomes successfully and are capable of distinguishing organisms at different taxonomic levels.
ICOs perform better than tetranucleotide composition in characterizing microbial genomes. The hybrid feature, which combines the two kinds of sequence features, had advantages over either single sequence feature. © 2014 Elsevier Ltd. All rights reserved.
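The abstract uses tetranucleotide composition as its baseline signature and compares signatures through distances between sequences. As a rough illustration of that baseline only, the sketch below computes a normalized tetranucleotide frequency vector and the distance between two such vectors. The function names, the choice of Euclidean distance, and the handling of ambiguous bases are assumptions made for this example; the abstract does not define the ICO measure or the exact distance used, so neither is reproduced here.

```python
from collections import Counter
from itertools import product
import math

# All 256 tetranucleotides over A, C, G, T in a fixed order,
# so every profile vector uses the same coordinates.
TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetranucleotide_composition(seq):
    """Return the 256-dimensional relative frequency vector of 4-mers.

    Windows containing characters other than A, C, G, T (for example N)
    are ignored. This is an assumption of the sketch, not a rule taken
    from the paper.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    total = sum(counts[t] for t in TETRAMERS)
    if total == 0:
        return [0.0] * len(TETRAMERS)
    return [counts[t] / total for t in TETRAMERS]


def euclidean_distance(u, v):
    """Euclidean distance between two profile vectors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def profile_distance(seq_a, seq_b):
    """Distance between the tetranucleotide profiles of two sequences."""
    return euclidean_distance(
        tetranucleotide_composition(seq_a),
        tetranucleotide_composition(seq_b),
    )
```

Under these assumptions, calling `profile_distance` on two fragments of one genome gives an intra-genomic distance, and calling it on fragments from different genomes gives an inter-genomic distance. A signature discriminates well when the former tend to be smaller than the latter.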
Pages: 9-18
Page count: 10