Towards more robust methods of alien gene detection

Cited by: 19
Authors
Azad, Rajeev K. [1]
Lawrence, Jeffrey G. [1]
Affiliations
[1] Univ Pittsburgh, Dept Biol Sci, Pittsburgh, PA 15260 USA
Funding
National Institutes of Health (USA);
Keywords
HORIZONTALLY TRANSFERRED GENES; MACHINE LEARNING APPROACH; ESCHERICHIA-COLI GENOME; BACTERIAL GENOMES; PROKARYOTIC GENOMES; MICROBIAL GENOMES; ISLANDS; IDENTIFICATION; DATABASE; HETEROGENEITY;
DOI
10.1093/nar/gkr059
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification codes
071010; 081704;
Abstract
Because the properties of horizontally-transferred genes will reflect the mutational proclivities of their donor genomes, they often show atypical compositional properties relative to native genes. Parametric methods use these discrepancies to identify bacterial genes recently acquired by horizontal transfer. However, compositional patterns of native genes vary stochastically, leaving no clear boundary between typical and atypical genes. As a result, while strongly atypical genes are readily identified as alien, genes of ambiguous character are poorly classified when a single threshold separates typical and atypical genes. This limitation affects all parametric methods that examine genes independently, and escaping it requires the use of additional genomic information. We propose that the performance of all parametric methods can be improved by using a multiple-threshold approach. First, strongly atypical alien genes and strongly typical native genes would be identified using conservative thresholds. Genes with ambiguous compositional features would then be classified by examining gene context, including the class (native or alien) of flanking genes. By including additional genomic information in a multiple-threshold framework, we observed a remarkable improvement in the performance of several popular, but algorithmically distinct, methods for alien gene detection.
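The two-stage idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. Everything in it is an assumption made for illustration: the function name `classify_genes`, a single per-gene atypicality score (higher means more atypical), the `low` and `high` thresholds, and the rule that an ambiguous gene takes the class of its nearest confidently classified neighbors when they agree, falling back to the midpoint of the two thresholds otherwise.

```python
from typing import List, Optional


def classify_genes(scores: List[float], low: float, high: float) -> List[str]:
    """Label genes (given in genome order) as 'native' or 'alien'.

    scores: hypothetical atypicality score per gene (higher = more atypical).
    low, high: conservative thresholds (assumed, with low < high).
    """
    # Stage 1: conservative thresholds pick out only the clear cases.
    labels = []
    for s in scores:
        if s >= high:
            labels.append("alien")
        elif s <= low:
            labels.append("native")
        else:
            labels.append("ambiguous")

    # Stage 2: resolve ambiguous genes from gene context, using the
    # nearest confidently classified gene on each side.
    def nearest(i: int, step: int) -> Optional[str]:
        j = i + step
        while 0 <= j < len(labels):
            if labels[j] != "ambiguous":
                return labels[j]
            j += step
        return None

    midpoint = (low + high) / 2  # fallback single threshold (assumption)
    resolved = list(labels)
    for i, label in enumerate(labels):
        if label != "ambiguous":
            continue
        flanks = [f for f in (nearest(i, -1), nearest(i, +1)) if f is not None]
        if flanks and all(f == flanks[0] for f in flanks):
            # Flanking genes agree: inherit their class.
            resolved[i] = flanks[0]
        else:
            # Flanks disagree or are missing: fall back to the gene's own score.
            resolved[i] = "alien" if scores[i] >= midpoint else "native"
    return resolved


if __name__ == "__main__":
    example = [0.1, 0.5, 0.9, 0.6, 0.95, 0.4, 0.1]
    print(classify_genes(example, low=0.3, high=0.8))
```

In this toy example the gene scoring 0.6 sits between two confidently alien genes and is labeled alien, whereas a single threshold placed above 0.6 would have called it native. A real application would take the scores from an actual parametric method and would likely use richer context than the class of flanking genes alone.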
Pages: 11