Using ensemble methods to deal with imbalanced data in predicting protein-protein interactions

Cited by: 90
Authors
Zhang, Yongqing [1 ]
Zhang, Danling [1 ]
Mi, Gang [2 ]
Ma, Daichuan [3 ]
Li, Gongbing [1 ]
Guo, Yanzhi [3 ]
Li, Menglong [3 ]
Zhu, Min [1 ]
Affiliations
[1] Sichuan Univ, Coll Comp Sci, Chengdu 610065, Peoples R China
[2] Sichuan Univ, Sch Life Sci, Chengdu 610065, Peoples R China
[3] Sichuan Univ, Coll Chem, Chengdu 610065, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Protein-protein interaction; Ensemble methods; Imbalanced data; Hydrophobicity
DOI
10.1016/j.compbiolchem.2011.12.003
Chinese Library Classification
Q [Biological Sciences]
Discipline classification codes
07 ; 0710 ; 09 ;
Abstract
Among protein pairs, interacting pairs are usually far fewer than non-interacting ones, so predicting protein-protein interactions (PPIs) is an imbalanced data problem. In this article, we introduce two ensemble methods to address this problem. Both combine a cluster-based under-sampling technique with fusion classifiers. We evaluate the ensemble methods on a dataset from the Database of Interacting Proteins (DIP) using 10-fold cross validation. All the prediction models achieve an area under the receiver operating characteristic curve (AUC) of about 95%. Our results show that the ensemble classifiers are quite effective in predicting PPIs; we also draw some useful conclusions about the performance of ensemble methods for PPI prediction on imbalanced data. The prediction software and all datasets employed in the work can be obtained for free at http://cic.scu.edu.cn/bioinformatics/Ensemble_PPIs/index.html. (C) 2011 Elsevier Ltd. All rights reserved.
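The abstract does not say which clustering algorithm or base classifiers the authors used, so the following is only a minimal sketch of the general idea, under stated assumptions: plain k-means for the cluster-based under-sampling, a nearest-centroid scorer as a stand-in base learner, and score averaging as the fusion step. All function names (`kmeans`, `cluster_undersample`, `train_ensemble`, `score`, `auc`) and the toy data are hypothetical and are not taken from the paper or its software.

```python
import random

def sqdist(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((u - v) ** 2 for u, v in zip(a, b))

def centroid(points):
    """Component-wise mean of a list of feature tuples."""
    return tuple(sum(col) / len(points) for col in zip(*points))

def kmeans(points, k, rng, iters=20):
    """Plain Lloyd's k-means (an assumed choice); returns one cluster index per point."""
    centers = rng.sample(points, k)
    assign = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center.
        assign = [min(range(k), key=lambda c: sqdist(p, centers[c])) for p in points]
        # Move each center to the mean of its members; empty clusters stay put.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centers[c] = centroid(members)
    return assign

def cluster_undersample(majority, n_keep, k, rng):
    """Keep roughly n_keep majority samples, drawn from every cluster."""
    assign = kmeans(majority, k, rng)
    kept = []
    for c in range(k):
        members = [p for p, a in zip(majority, assign) if a == c]
        # Each non-empty cluster contributes in proportion to its size (at least one).
        quota = max(1, round(n_keep * len(members) / len(majority))) if members else 0
        kept.extend(rng.sample(members, min(quota, len(members))))
    return kept

def train_ensemble(minority, majority, n_members, k, seed=0):
    """One nearest-centroid scorer per balanced subset (stand-in base learner)."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_members):
        neg = cluster_undersample(majority, len(minority), k, rng)
        models.append((centroid(minority), centroid(neg)))
    return models

def score(models, x):
    """Fusion by averaging: higher means closer to the positive centroids."""
    return sum(sqdist(x, neg) - sqdist(x, pos) for pos, neg in models) / len(models)

def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy imbalanced data (1:10), invented purely for illustration.
data_rng = random.Random(42)
positives = [(data_rng.gauss(2, 1), data_rng.gauss(2, 1)) for _ in range(30)]
negatives = [(data_rng.gauss(-2, 1), data_rng.gauss(-2, 1)) for _ in range(300)]
labels = [1] * len(positives) + [0] * len(negatives)

models = train_ensemble(positives, negatives, n_members=5, k=3)
# Scored on the training data only to show the calls; a real evaluation
# would hold out folds, as in the 10-fold cross validation the abstract describes.
toy_auc = auc([score(models, x) for x in positives + negatives], labels)
print(round(toy_auc, 3))
```

Sampling from every cluster, rather than uniformly at random, is meant to keep each region of the majority class represented in the reduced training set; drawing a fresh subset per ensemble member lets the members differ, which is what makes the fusion step useful.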
Pages: 36-41
Page count: 6