Improving protein function prediction using domain and protein complexes in PPI networks

Cited by: 40
Authors
Peng, Wei [1,2]
Wang, Jianxin [1]
Cai, Juan [1]
Chen, Lu [1]
Li, Min [1]
Wu, Fang-Xiang [1,3,4]
Affiliations
[1] Cent South Univ, Sch Informat Sci & Engn, Changsha 410083, Hunan, Peoples R China
[2] Kunming Univ Sci & Technol, Fac Informat Engn & Automat, Kunming 650093, Yunnan, Peoples R China
[3] Univ Saskatchewan, Dept Mech Engn, Saskatoon, SK S7N 5A9, Canada
[4] Univ Saskatchewan, Div Biomed Engn, Saskatoon, SK S7N 5A9, Canada
Funding
National Natural Science Foundation of China;
Keywords
GENE ONTOLOGY; DATABASE; YEAST; GENERATION; ANNOTATION; ALGORITHM; MODULES;
DOI
10.1186/1752-0509-8-35
Chinese Library Classification number
Q [Biological Sciences];
Discipline classification codes
07; 0710; 09;
Abstract
Background: Characterizing unknown proteins through computational approaches is one of the most challenging problems in in silico biology and has attracted worldwide interest and great effort. Several computational methods have been proposed to address this problem, based either on homology mapping or on the context of protein interaction networks.
Results: In this paper, two algorithms are proposed that integrate the protein-protein interaction (PPI) network, proteins' domain information, and protein complexes. The first is domain combination similarity (DCS), which combines the domain compositions of both proteins and their neighbors. The second is domain combination similarity in context of protein complexes (DSCP), which extends the protein functional similarity definition of DCS by combining the domain compositions of both proteins and the complexes that include them. The new algorithms are tested on networks of the model species Saccharomyces cerevisiae to predict the functions of unknown proteins using cross-validation. Compared with several existing algorithms, the results demonstrate the effectiveness of the proposed methods in protein function prediction. Furthermore, the DSCP algorithm, when using experimentally determined complex data, is robust when a large percentage of the proteins in the network are unknown, and it outperforms DCS and several other existing algorithms.
Conclusions: The accuracy of protein function prediction can be improved by integrating the protein-protein interaction (PPI) network, proteins' domain information, and protein complexes.
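The idea of combining the domain compositions of a protein and its neighbors can be illustrated with a minimal sketch. It is not the paper's DCS definition, which this record does not give: the Jaccard index, the function names, and the toy proteins, domains, and function labels below are all assumptions chosen for illustration.

```python
# Illustrative sketch only: Jaccard similarity over the pooled domain sets of a
# protein and its direct PPI neighbors. This is an assumed stand-in, not the
# DCS formula from the paper. All data below is hypothetical toy data.

def jaccard(a, b):
    """Jaccard index of two sets; defined as 0.0 when both are empty."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def context_domains(protein, ppi, domains):
    """Domains of the protein pooled with the domains of its direct neighbors."""
    pooled = set(domains.get(protein, ()))
    for neighbor in ppi.get(protein, ()):
        pooled |= set(domains.get(neighbor, ()))
    return pooled

def similarity(p, q, ppi, domains):
    """Similarity of two proteins based on their pooled domain sets."""
    return jaccard(context_domains(p, ppi, domains),
                   context_domains(q, ppi, domains))

def predict(unknown, ppi, domains, annotations):
    """Copy the function labels of the most similar annotated protein."""
    candidates = [p for p in annotations if p != unknown]
    if not candidates:
        return set()
    best = max(candidates, key=lambda p: similarity(unknown, p, ppi, domains))
    return set(annotations[best])

# Hypothetical toy network: A-B and C-D interact, E is isolated.
ppi = {"A": {"B"}, "B": {"A"}, "C": {"D"}, "D": {"C"}, "E": set()}
domains = {"A": {"PF1"}, "B": {"PF2"}, "C": {"PF1"},
           "D": {"PF2", "PF4"}, "E": {"PF9"}}
annotations = {"C": {"F1"}, "E": {"F2"}}

predicted = predict("A", ppi, domains, annotations)
```

In this toy network, protein A is matched to C rather than E because A and C share domains both directly and through their neighbors. A complex-aware variant in the spirit of DSCP would pool domains over the members of a protein's complexes instead of its direct neighbors.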
Pages: 13