An Integrative Method for Identifying the Over-Annotated Protein-Coding Genes in Microbial Genomes

Cited by: 16
Authors
Yu, Jia-Feng [1, 2]
Xiao, Ke [1]
Jiang, Dong-Ke [1]
Guo, Jing [1]
Wang, Ji-Hua [2]
Sun, Xiao [1]
Affiliations
[1] Southeast Univ, Sch Biol Sci & Med Engn, State Key Lab Bioelect, Nanjing 210096, Jiangsu, Peoples R China
[2] Dezhou Univ, Dept Phys, Shandong Prov Key Lab Biophys Funct Macromol, Dezhou 253023, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
protein-coding gene; microbial genome; re-annotation; horizontal gene transfer; HORIZONTALLY TRANSFERRED GENES; RE-ANNOTATION; GRAPHICAL REPRESENTATION; ESCHERICHIA-COLI; DNA-SEQUENCE; CODON USAGE; BACTERIAL; ORFS; IDENTIFICATION; REANNOTATION;
DOI
10.1093/dnares/dsr030
Chinese Library Classification
Q3 [Genetics]
Subject classification codes
071007; 090102
Abstract
Falsely annotated protein-coding genes are regarded as one of the major sources of annotation errors in public databases. Although many filtering approaches have been designed to detect over-annotated protein-coding genes, some are questionable because they increase the number of false negatives. Furthermore, no webserver or software has been devised specifically for the problem of over-annotation. In this study, we propose an integrative algorithm for detecting over-annotated protein-coding genes in microorganisms. Overall, an average accuracy of 99.94% is achieved over 61 microbial genomes. This extremely high accuracy indicates that the presented algorithm is effective at differentiating protein-coding genes from non-coding open reading frames. Extensive analyses show that the predictions are reliable and that the integrative algorithm is robust and convenient. Our analysis also indicates that over-annotated protein-coding genes can cause false positives in the detection of horizontal gene transfer. The webserver for the proposed algorithm is freely accessible at www.cbi.seu.edu.cn/RPGM.
Pages: 435-449
Page count: 15