Large-scale protein annotation through gene ontology

Cited by: 77
Authors
Xie, HQ [1]
Wasserman, A
Levine, Z
Novik, A
Grebinskiy, V
Shoshan, A
Mintz, L
Affiliations
[1] Compugen Inc, Jamesburg, NJ 08831 USA
[2] Compugen Ltd, IL-69512 Tel Aviv, Israel
Keywords
DOI
10.1101/gr.86902
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular biology];
Discipline classification code
071010; 081704;
Abstract
Recent progress in genomic sequencing, computational biology, and ontology development has presented an opportunity to investigate biological systems from a unique perspective, that is, examining genomes and transcriptomes through the multiple and hierarchical structure of Gene Ontology (GO). We report here our development of GO Engine, a computational platform for GO annotation, and analysis of the resultant GO annotations of human proteins. Protein annotation was centered on sequence homology with GO-annotated proteins and protein domain analysis. Text information analysis and a multiparameter cellular localization predictive tool were also used to increase the annotation accuracy, and to predict novel annotations. The majority of proteins corresponding to full-length mRNA in GenBank, and the majority of proteins in the NR database (nonredundant database of proteins) were annotated with one or more GO nodes in each of the three GO categories. The annotations of GenBank and SWISS-PROT proteins are available to the public at the GO Consortium web site.
Pages: 785-794
Page count: 10
Related papers
21 in total
  • [1] Gapped BLAST and PSI-BLAST: a new generation of protein database search programs
    Altschul, SF
    Madden, TL
    Schaffer, AA
    Zhang, JH
    Zhang, Z
    Miller, W
    Lipman, DJ
    [J]. NUCLEIC ACIDS RESEARCH, 1997, 25 (17) : 3389 - 3402
  • [2] The InterPro database, an integrated documentation resource for protein families, domains and functional sites
    Apweiler, R
    Attwood, TK
    Bairoch, A
    Bateman, A
    Birney, E
    Biswas, M
    Bucher, P
    Cerutti, T
    Corpet, F
    Croning, MDR
    Durbin, R
    Falquet, L
    Fleischmann, W
    Gouzy, J
    Hermjakob, H
    Hulo, N
    Jonassen, I
    Kahn, D
    Kanapin, A
    Karavidopoulou, Y
    Lopez, R
    Marx, B
    Mulder, NJ
    Oinn, TM
    Pagni, M
    Servant, F
    Sigrist, CJA
    Zdobnov, EM
    [J]. NUCLEIC ACIDS RESEARCH, 2001, 29 (01) : 37 - 40
  • [3] Ashburner M, 2001, GENOME RES, V11, P1425
  • [4] Gene Ontology: tool for the unification of biology
    Ashburner, M
    Ball, CA
    Blake, JA
    Botstein, D
    Butler, H
    Cherry, JM
    Davis, AP
    Dolinski, K
    Dwight, SS
    Eppig, JT
    Harris, MA
    Hill, DP
    Issel-Tarver, L
    Kasarskis, A
    Lewis, S
    Matese, JC
    Richardson, JE
    Ringwald, M
    Rubin, GM
    Sherlock, G
    [J]. NATURE GENETICS, 2000, 25 (01) : 25 - 29
  • [5] The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000
    Bairoch, A
    Apweiler, R
    [J]. NUCLEIC ACIDS RESEARCH, 2000, 28 (01) : 45 - 48
  • [6] Clustering protein sequences-structure prediction by transitive homology
    Bolten, E
    Schliep, A
    Schneckener, S
    Schomburg, D
    Schrader, R
    [J]. BIOINFORMATICS, 2001, 17 (10) : 935 - 941
  • [7] Saccharomyces Genome Database (SGD) provides secondary gene annotation using the Gene Ontology (GO)
    Dwight, SS
    Harris, MA
    Dolinski, K
    Ball, CA
    Binkley, G
    Christie, KR
    Fisk, DG
    Issel-Tarver, L
    Schroeder, M
    Sherlock, G
    Sethuraman, A
    Weng, S
    Botstein, D
    Cherry, JM
    [J]. NUCLEIC ACIDS RESEARCH, 2002, 30 (01) : 69 - 72
  • [8] Ancient origin of the Hox gene cluster
    Ferrier, DEK
    Holland, PWH
    [J]. NATURE REVIEWS GENETICS, 2001, 2 (01) : 33 - 38
  • [9] Gelbart WM, 2002, NUCLEIC ACIDS RES, V30, P106
  • [10] The Ensembl genome database project
    Hubbard, T
    Barker, D
    Birney, E
    Cameron, G
    Chen, Y
    Clark, L
    Cox, T
    Cuff, J
    Curwen, V
    Down, T
    Durbin, R
    Eyras, E
    Gilbert, J
    Hammond, M
    Huminiecki, L
    Kasprzyk, A
    Lehvaslaiho, H
    Lijnzaad, P
    Melsopp, C
    Mongin, E
    Pettett, R
    Pocock, M
    Potter, S
    Rust, A
    Schmidt, E
    Searle, S
    Slater, G
    Smith, J
    Spooner, W
    Stabenau, A
    Stalker, J
    Stupka, E
    Ureta-Vidal, A
    Vastrik, I
    Clamp, M
    [J]. NUCLEIC ACIDS RESEARCH, 2002, 30 (01) : 38 - 41