Analysis of genomic signatures in prokaryotes using multinomial regression and hierarchical clustering

Cited by: 11
Authors
Bohlin, Jon [1]
Skjerve, Eystein [1]
Ussery, David W. [2]
Affiliations
[1] Norwegian Sch Vet Sci, N-0033 Oslo, Norway
[2] Tech Univ Denmark, Ctr Biol Sequence Anal, DK-2800 Lyngby, Denmark
Keywords
EVOLUTIONARY IMPLICATIONS; MICROBIAL COMMUNITIES; USAGE PATTERNS; BACTERIAL; METAGENOMICS; FREQUENCIES; SEQUENCES; BIASES;
DOI
10.1186/1471-2164-10-487
Chinese Library Classification number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification code
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Background: The recent explosion in the availability of bacterial genomic sequences makes it possible to analyze genomic signatures across more than 800 different bacterial chromosomes from a wide variety of environments. Using genomic signatures, we performed pairwise comparisons of 867 different genomic DNA sequences, taken from chromosomes and plasmids more than 100,000 base pairs in length. Hierarchical clustering was performed on the outcome of the comparisons before a multinomial regression model was fitted. The regression model used the cluster groups as the response variable, with AT content, phylum, growth temperature, selective pressure, habitat, sequence size, oxygen requirement and pathogenicity as predictors.
Results: Many significant factors were associated with the genomic signature, most notably AT content. Phylum was also an important factor, although considerably less so than AT content. Smaller but still significant improvements to the regression model came from sequence size, habitat, growth temperature, selective pressure (measured as oligonucleotide usage variance) and oxygen requirement.
Conclusion: The statistics obtained using hierarchical clustering and multinomial regression analysis indicate that the genomic signature is shaped by many factors, and this may explain the varying ability to classify prokaryotic organisms below genus level.
Pages: 9
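
The pipeline described in the abstract (compute a genomic signature per sequence, compare sequences pairwise, then cluster hierarchically) can be illustrated with a minimal sketch. This is not the authors' implementation: the use of dinucleotide frequencies as the signature, the mean absolute difference as the distance, average linkage as the clustering rule, and the toy sequences are all assumptions made for illustration. The multinomial regression step is not shown.

```python
from itertools import product


def signature(seq, k=2):
    """Relative frequency of every DNA word of length k in seq (assumed signature)."""
    seq = seq.upper()
    counts = {"".join(p): 0 for p in product("ACGT", repeat=k)}
    total = 0
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # skip words containing N or other ambiguity codes
            counts[word] += 1
            total += 1
    return {w: (c / total if total else 0.0) for w, c in counts.items()}


def distance(sig_a, sig_b):
    """Mean absolute difference between two signatures (0 means identical)."""
    return sum(abs(sig_a[w] - sig_b[w]) for w in sig_a) / len(sig_a)


def average_linkage(names, dist, n_clusters):
    """Merge the two closest clusters repeatedly until n_clusters remain."""
    clusters = [[n] for n in names]

    def linkage(a, b):
        # Average of all pairwise distances between members of a and b.
        return sum(dist[x][y] for x in a for y in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        pairs = [(i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        i, j = min(pairs, key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical toy sequences standing in for real chromosomes and plasmids.
seqs = {
    "at_rich_1": "ATATTAATATTTAAATATAT" * 5,
    "at_rich_2": "TTAATATAAATTTATATAAT" * 5,
    "gc_rich_1": "GCGGCCGCGCCGGGCGCGCC" * 5,
    "gc_rich_2": "CCGCGGGCGCGCCGGCCGCG" * 5,
}
sigs = {name: signature(s) for name, s in seqs.items()}
dist = {a: {b: distance(sigs[a], sigs[b]) for b in sigs} for a in sigs}
clusters = average_linkage(list(seqs), dist, n_clusters=2)
print(clusters)
```

In this toy case the sequences separate by base composition, in line with the abstract's finding that AT content is the factor most strongly associated with the signature. A real analysis would use longer oligonucleotides and a library implementation of hierarchical clustering.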