Application of Genetic Programming (GP) Formalism for Building Disease Predictive Models from Protein-Protein Interactions (PPI) Data

Cited by: 13
Authors
Vyas, Renu [1]
Bapat, Sanket [2]
Goel, Purva [3]
Karthikeyan, Muthukumarasamy [2]
Tambe, Sanjeev S. [3]
Kulkarni, Bhaskar D. [3]
Affiliations
[1] MIT Sch Bioengn Sci & Res, Pune 411008, Maharashtra, India
[2] Natl Chem Lab, CSIR, DIRC, Pune 411008, Maharashtra, India
[3] Natl Chem Lab, CSIR, Chem Engn & Proc Dev Div, Pune 411008, Maharashtra, India
Keywords
Genetic programming; protein-protein interactions; disease; binding energy; machine learning; cancer; symbolic regression; breast-cancer survivability; selection
DOI
10.1109/TCBB.2016.2621042
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Protein-protein interactions (PPIs) play a vital role in the biological processes underlying cell function and disease pathways. The experimental methods used to identify PPIs require tremendous effort, and their results are often compromised by a large number of false positives. Herein, we demonstrate the use of a new Genetic Programming (GP) based Symbolic Regression (SR) approach for predicting PPIs related to a disease. In this case study, a dataset of 135 cancer-related PPI complexes was used to construct a generic PPI-predicting model with good prediction accuracy and generalization ability. A high correlation coefficient (CC) magnitude of 0.893, together with low root mean square error (RMSE) and mean absolute percentage error (MAPE) values of 478.221 and 0.239, respectively, was achieved for both the training and test set outputs. To validate the discriminatory nature of the model, it was applied to a dataset of diabetes complexes, where it yielded significantly low CC values. Thus, the GP model developed here serves a dual purpose: (a) as a predictor of the binding energy of cancer-related PPI complexes, and (b) as a classifier for discriminating PPI complexes related to cancer from those of other diseases.
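The three figures of merit quoted in the abstract (CC, RMSE, MAPE) can be illustrated with a minimal sketch. This is not the authors' code: the function names and the sample binding-energy values below are hypothetical, and the sketch assumes the standard textbook definitions, with MAPE expressed as a percentage. The abstract does not state which scaling or units the paper uses, so the numbers here are not comparable to the reported ones.

```python
import math


def pearson_cc(y_true, y_pred):
    """Pearson correlation coefficient between observed and predicted values."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


def rmse(y_true, y_pred):
    """Root mean square error, in the same units as the target."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (assumes no zero targets)."""
    n = len(y_true)
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / n


# Hypothetical binding energies (arbitrary units), NOT data from the paper.
observed = [-100.0, -200.0, -300.0, -400.0]
predicted = [-110.0, -190.0, -330.0, -380.0]

print("CC:  ", pearson_cc(observed, predicted))
print("RMSE:", rmse(observed, predicted))
print("MAPE:", mape(observed, predicted))
```

Computing the same three metrics separately on a training set and a held-out test set, as the abstract describes, is the usual way to check that a fitted model generalizes rather than memorizes.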
Pages: 27-37
Number of pages: 11