Protein-protein interaction prediction using a hybrid feature representation and a stacked generalization scheme

Cited by: 43
Authors
Chen, Kuan-Hsi [1]
Wang, Tsai-Feng [2]
Hu, Yuh-Jyh [3]
Affiliations
[1] Natl Chiao Tung Univ, Coll Comp Sci, Hsinchu 300, Taiwan
[2] Natl Chiao Tung Univ, Inst Data Sci & Engn, Hsinchu 300, Taiwan
[3] Natl Chiao Tung Univ, Inst Biomed Engn, Coll Comp Sci, Hsinchu 300, Taiwan
Keywords
Protein-protein interaction; Stacked generalization; Gene ontology; Network topology; SEMANTIC SIMILARITY MEASURES; GENE ONTOLOGY; SEQUENCES; SCALE; TOOL; RESIDUES; CELL
DOI
10.1186/s12859-019-2907-1
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Background: Although various machine learning-based predictors have been developed for estimating protein-protein interactions, their performances vary with dataset and species, and are affected by two primary aspects: choice of learning algorithm, and the representation of protein pairs. To improve the performance of predicting protein-protein interactions, we exploit the synergy of multiple learning algorithms, and utilize the expressiveness of different protein-pair features.

Results: We developed a stacked generalization scheme that integrates five learning algorithms. We also designed three types of protein-pair features based on the physicochemical properties of amino acids, gene ontology annotations, and interaction network topologies. When tested on 19 published datasets collected from eight species, the proposed approach achieved a significantly higher or comparable overall performance, compared with seven competitive predictors.

Conclusion: We introduced an ensemble learning approach for PPI prediction that integrated multiple learning algorithms and different protein-pair representations. The extensive comparisons with other state-of-the-art prediction tools demonstrated the feasibility and superiority of the proposed method.
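For readers unfamiliar with stacked generalization, the general idea can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not name the five learning algorithms or give the feature definitions, so the sketch uses three hypothetical threshold learners (`Stump`), a hypothetical logistic meta-learner (`LogisticMeta`), and synthetic data from `make_toy_pairs` in which each of three toy values stands in for one feature type (physicochemical, gene ontology, network topology). The meta-learner is trained on out-of-fold scores so that it never sees a level-0 prediction made on a pair the level-0 learner was fitted on.

```python
import math
import random


def make_toy_pairs(n, seed):
    """Synthetic stand-in for protein-pair feature vectors (not real PPI data)."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2  # alternate non-interacting (0) and interacting (1) pairs
        # One toy value per hypothetical feature group:
        # physicochemical, GO-based, network-topology.
        X.append([rng.gauss(1.5 * label, 1.0) for _ in range(3)])
        y.append(label)
    return X, y


class Stump:
    """Level-0 learner (assumption): scores one feature against the class-mean midpoint."""

    def __init__(self, feature):
        self.feature = feature

    def fit(self, X, y):
        pos = [x[self.feature] for x, t in zip(X, y) if t == 1]
        neg = [x[self.feature] for x, t in zip(X, y) if t == 0]
        mean_pos, mean_neg = sum(pos) / len(pos), sum(neg) / len(neg)
        self.threshold = (mean_pos + mean_neg) / 2
        self.sign = 1.0 if mean_pos >= mean_neg else -1.0
        return self

    def score(self, x):
        # Probability-like score in (0, 1); larger means "more likely to interact".
        return 1 / (1 + math.exp(-self.sign * (x[self.feature] - self.threshold)))


class LogisticMeta:
    """Level-1 learner (assumption): logistic regression via batch gradient descent."""

    def fit(self, Z, y, epochs=500, lr=0.5):
        self.w, self.b = [0.0] * len(Z[0]), 0.0
        n = len(Z)
        for _ in range(epochs):
            grad_w, grad_b = [0.0] * len(self.w), 0.0
            for z, t in zip(Z, y):
                err = self.prob(z) - t
                grad_w = [g + err * zj for g, zj in zip(grad_w, z)]
                grad_b += err
            self.w = [w - lr * g / n for w, g in zip(self.w, grad_w)]
            self.b -= lr * grad_b / n
        return self

    def prob(self, z):
        s = self.b + sum(w * zj for w, zj in zip(self.w, z))
        return 1 / (1 + math.exp(-s))


def make_learners():
    # One fresh level-0 learner per toy feature group.
    return [Stump(0), Stump(1), Stump(2)]


def fit_stack(X, y, k=5):
    """Fit level-0 learners, then fit the meta-learner on their out-of-fold scores."""
    Z = [None] * len(X)
    for fold in range(k):
        train = [i for i in range(len(X)) if i % k != fold]
        fitted = [m.fit([X[i] for i in train], [y[i] for i in train])
                  for m in make_learners()]
        for i in range(fold, len(X), k):  # held-out indices of this fold
            Z[i] = [m.score(X[i]) for m in fitted]
    meta = LogisticMeta().fit(Z, y)
    base = [m.fit(X, y) for m in make_learners()]  # refit on all data for prediction
    return base, meta


def predict(base, meta, x):
    """Return 1 (predicted interaction) or 0 for one feature vector."""
    return int(meta.prob([m.score(x) for m in base]) >= 0.5)


X_train, y_train = make_toy_pairs(200, seed=0)
X_test, y_test = make_toy_pairs(200, seed=1)
base, meta = fit_stack(X_train, y_train)
accuracy = sum(predict(base, meta, x) == t for x, t in zip(X_test, y_test)) / len(y_test)
print(f"toy hold-out accuracy: {accuracy:.2f}")
```

The accuracy printed here describes only the synthetic data and says nothing about the results reported in the paper.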
Pages: 17
Related papers
57 in total
  • [51] STACKED GENERALIZATION
    WOLPERT, DH
    [J]. NEURAL NETWORKS, 1992, 5 (02) : 241 - 259
  • [52] Prediction of functional modules based on comparative genome analysis and Gene Ontology application
    Wu, HW
    Su, ZC
    Mao, FL
    Olman, V
    Xu, Y
    [J]. NUCLEIC ACIDS RESEARCH, 2005, 33 (09) : 2822 - 2837
  • [53] Prediction of yeast protein-protein interaction network: insights from the Gene Ontology and annotations
    Wu, Xiaomei
    Zhu, Lei
    Guo, Jie
    Zhang, Da-Yong
    Lin, Kui
    [J]. NUCLEIC ACIDS RESEARCH, 2006, 34 (07) : 2137 - 2150
  • [54] Prediction of protein-protein interactions from amino acid sequences with ensemble extreme learning machines and principal component analysis
    You, Zhu-Hong
    Lei, Ying-Ke
    Zhu, Lin
    Xia, Junfeng
    Wang, Bing
    [J]. BMC BIOINFORMATICS, 2013, 14
  • [55] An improved approach to infer protein-protein interaction based on a hierarchical vector space model
    Zhang, Jiongmin
    Jia, Ke
    Jia, Jinmeng
    Qian, Ying
    [J]. BMC BIOINFORMATICS, 2018, 19
  • [56] Predicting co-complexed protein pairs using genomic and proteomic data integration
    Zhang, LV
    Wong, SL
    King, OD
    Roth, FP
    [J]. BMC BIOINFORMATICS, 2004, 5 (1)
  • [57] Global analysis of protein activities using proteome chips
    Zhu, H
    Bilgin, M
    Bangham, R
    Hall, D
    Casamayor, A
    Bertone, P
    Lan, N
    Jansen, R
    Bidlingmaier, S
    Houfek, T
    Mitchell, T
    Miller, P
    Dean, RA
    Gerstein, M
    Snyder, M
    [J]. SCIENCE, 2001, 293 (5537) : 2101 - 2105