Towards more robust methods of alien gene detection

Cited by: 19
Authors
Azad, Rajeev K. [1]
Lawrence, Jeffrey G. [1]
Affiliations
[1] Univ Pittsburgh, Dept Biol Sci, Pittsburgh, PA 15260 USA
Funding
US National Institutes of Health (NIH)
Keywords
HORIZONTALLY TRANSFERRED GENES; MACHINE LEARNING APPROACH; ESCHERICHIA-COLI GENOME; BACTERIAL GENOMES; PROKARYOTIC GENOMES; MICROBIAL GENOMES; ISLANDS; IDENTIFICATION; DATABASE; HETEROGENEITY;
DOI
10.1093/nar/gkr059
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
Because the properties of horizontally-transferred genes will reflect the mutational proclivities of their donor genomes, they often show atypical compositional properties relative to native genes. Parametric methods use these discrepancies to identify bacterial genes recently acquired by horizontal transfer. However, compositional patterns of native genes vary stochastically, leaving no clear boundary between typical and atypical genes. As a result, while strongly atypical genes are readily identified as alien, genes of ambiguous character are poorly classified when a single threshold separates typical and atypical genes. This limitation affects all parametric methods that examine genes independently, and escaping it requires the use of additional genomic information. We propose that the performance of all parametric methods can be improved by using a multiple-threshold approach. First, strongly atypical alien genes and strongly typical native genes would be identified using conservative thresholds. Genes with ambiguous compositional features would then be classified by examining gene context, including the class (native or alien) of flanking genes. By including additional genomic information in a multiple-threshold framework, we observed a remarkable improvement in the performance of several popular, but algorithmically distinct, methods for alien gene detection.
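The two-step procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the per-gene atypicality score, the threshold names `lo`, `hi` and `mid`, and the specific context rule (an ambiguous gene takes the class of its nearest confidently classified neighbours when both sides agree, and otherwise falls back to a single threshold) are all assumptions made for the example.

```python
from typing import List, Optional


def classify_genes(scores: List[float], lo: float, hi: float, mid: float) -> List[str]:
    """Label genes 'native' or 'alien' from a per-gene atypicality score.

    Genes are assumed to be listed in chromosomal order. The thresholds
    lo, hi and mid are hypothetical parameters of this sketch.
    """
    # Pass 1: conservative thresholds pick out only the clear cases.
    confident: List[Optional[str]] = []
    for s in scores:
        if s >= hi:
            confident.append("alien")    # strongly atypical
        elif s <= lo:
            confident.append("native")   # strongly typical
        else:
            confident.append(None)       # ambiguous, decided in pass 2

    def nearest_confident(i: int, step: int) -> Optional[str]:
        """Class of the closest pass-1 gene to the left (-1) or right (+1)."""
        j = i + step
        while 0 <= j < len(confident):
            if confident[j] is not None:
                return confident[j]
            j += step
        return None

    # Pass 2: resolve ambiguous genes from the class of flanking genes.
    labels: List[str] = []
    for i, lab in enumerate(confident):
        if lab is not None:
            labels.append(lab)
            continue
        left = nearest_confident(i, -1)
        right = nearest_confident(i, +1)
        if left is not None and left == right:
            labels.append(left)          # both flanks agree
        else:
            # Assumed fallback: a single threshold when context is uninformative.
            labels.append("alien" if scores[i] >= mid else "native")
    return labels


example = classify_genes(
    [0.9, 0.6, 0.95, 0.1, 0.4, 0.05, 0.55, 0.9], lo=0.2, hi=0.8, mid=0.5
)
```

In this toy input, the gene scoring 0.6 sits between two confidently alien genes and is labelled alien, while the gene scoring 0.4 sits between two confidently native genes and is labelled native. The context rule reads only the pass-1 labels, so the result does not depend on the order in which ambiguous genes are visited.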
Pages: 11