Protein annotation from protein interaction networks and Gene Ontology

Cited by: 21
Authors
Nguyen, Cao D. [1, 2]
Gardiner, Katheleen J. [3, 4, 5]
Cios, Krzysztof J. [2, 6]
Affiliations
[1] Univ Western Australia, Med Res Ctr, Nedlands, WA 6009, Australia
[2] Virginia Commonwealth Univ, Richmond, VA 23284 USA
[3] Univ Colorado Denver, Dept Pediat, Intellectual & Dev Disabil Res Ctr, Program Computat Biol, Denver, CO USA
[4] Univ Colorado Denver, Dept Pediat, Intellectual & Dev Disabil Res Ctr, Program Neurosci, Denver, CO USA
[5] Univ Colorado Denver, Dept Pediat, Intellectual & Dev Disabil Res Ctr, Program Human Med Genet, Denver, CO USA
[6] Polish Acad Sci, Inst Theoret & Appl Informat, PL-00901 Warsaw, Poland
Funding
National Institutes of Health (USA);
Keywords
Protein function; Protein interaction networks; Naive Bayes; Association rules; Gene Ontology; INTERACTION MAP; FUNCTION PREDICTION; FUNCTIONAL MODULES; HETEROGENEOUS DATA; GENOME; DATABASE; ALGORITHM;
DOI
10.1016/j.jbi.2011.04.010
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
We introduce a novel method for annotating protein function that combines Naive Bayes and association rules, and takes advantage of the underlying topology in protein interaction networks and the structure of graphs in the Gene Ontology. We apply our method to proteins from the Human Protein Reference Database (HPRD) and show that, in comparison with other approaches, it predicts protein functions with significantly higher recall with no loss of precision. Specifically, it achieves 51% precision and 60% recall versus 45% and 26% for Majority and 24% and 61% for χ²-statistics, respectively. © 2011 Elsevier Inc. All rights reserved.
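The precision and recall figures quoted in the abstract can be put on a single scale with the F1 score (the harmonic mean of precision and recall). The abstract does not report F1; using it here is an assumption made for illustration only. The sketch below uses just the three (precision, recall) pairs stated above, and the dictionary labels are hypothetical names.

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (F1 score)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs as quoted in the abstract.
# The keys are illustrative labels, not names used by the authors.
reported = {
    "proposed": (0.51, 0.60),
    "Majority": (0.45, 0.26),
    "chi2": (0.24, 0.61),
}

# F1 is an assumed summary metric; the abstract reports only precision and recall.
f1_scores = {name: f1(p, r) for name, (p, r) in reported.items()}

for name, score in f1_scores.items():
    print(f"{name}: F1 = {score:.2f}")
```

Under this assumed metric the proposed method comes out at roughly 0.55, against roughly 0.33 for Majority and 0.34 for χ²-statistics. This matches the abstract's claim that the gain comes from recall without sacrificing precision.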
Pages: 824-829
Page count: 6
Related papers
52 in total
[1]  
AGRAWAL R, 1993, SIGMOD C, V207, P216
[2]   Gapped BLAST and PSI-BLAST: a new generation of protein database search programs [J].
Altschul, SF ;
Madden, TL ;
Schaffer, AA ;
Zhang, JH ;
Zhang, Z ;
Miller, W ;
Lipman, DJ .
NUCLEIC ACIDS RESEARCH, 1997, 25 (17) :3389-3402
[3]  
[Anonymous], PROGR MACHINE LEARNI
[4]   The IntAct molecular interaction database in 2010 [J].
Aranda, B. ;
Achuthan, P. ;
Alam-Faruque, Y. ;
Armean, I. ;
Bridge, A. ;
Derow, C. ;
Feuermann, M. ;
Ghanbarian, A. T. ;
Kerrien, S. ;
Khadake, J. ;
Kerssemakers, J. ;
Leroy, C. ;
Menden, M. ;
Michaut, M. ;
Montecchi-Palazzi, L. ;
Neuhauser, S. N. ;
Orchard, S. ;
Perreau, V. ;
Roechert, B. ;
van Eijk, K. ;
Hermjakob, H. .
NUCLEIC ACIDS RESEARCH, 2010, 38 :D525-D531
[5]  
ARMSTRONG W, 1974, INFORM PROCESSING, V74
[6]   Gene Ontology: tool for the unification of biology [J].
Ashburner, M ;
Ball, CA ;
Blake, JA ;
Botstein, D ;
Butler, H ;
Cherry, JM ;
Davis, AP ;
Dolinski, K ;
Dwight, SS ;
Eppig, JT ;
Harris, MA ;
Hill, DP ;
Issel-Tarver, L ;
Kasarskis, A ;
Lewis, S ;
Matese, JC ;
Richardson, JE ;
Ringwald, M ;
Rubin, GM ;
Sherlock, G .
NATURE GENETICS, 2000, 25 (01) :25-29
[7]   Use of logic relationships to decipher protein network organization [J].
Bowers, PM ;
Cokus, SJ ;
Eisenberg, D ;
Yeates, TO .
SCIENCE, 2004, 306 (5705) :2246-2249
[8]  
Brun C, 2004, GENOME BIOL, V5
[9]   A hub-attachment based method to detect functional modules from confidence-scored protein interactions and expression profiles [J].
Chin, Chia-Hao ;
Chen, Shu-Hwa ;
Ho, Chin-Wen ;
Ko, Ming-Tat ;
Lin, Chung-Yen .
BMC BIOINFORMATICS, 2010, 11
[10]   Predicting Protein Function by Frequent Functional Association Pattern Mining in Protein Interaction Networks [J].
Cho, Young-Rae ;
Zhang, Aidong .
IEEE TRANSACTIONS ON INFORMATION TECHNOLOGY IN BIOMEDICINE, 2010, 14 (01) :30-36