Using ensemble methods to deal with imbalanced data in predicting protein-protein interactions

Cited by: 90
Authors
Zhang, Yongqing [1 ]
Zhang, Danling [1 ]
Mi, Gang [2 ]
Ma, Daichuan [3 ]
Li, Gongbing [1 ]
Guo, Yanzhi [3 ]
Li, Menglong [3 ]
Zhu, Min [1 ]
Affiliations
[1] Sichuan Univ, Coll Comp Sci, Chengdu 610065, Peoples R China
[2] Sichuan Univ, Sch Life Sci, Chengdu 610065, Peoples R China
[3] Sichuan Univ, Coll Chem, Chengdu 610065, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Protein-protein interaction; Ensemble methods; Imbalanced data; HYDROPHOBICITY;
DOI
10.1016/j.compbiolchem.2011.12.003
Chinese Library Classification number
Q [Biological sciences];
Discipline classification codes
07 ; 0710 ; 09 ;
Abstract
Among protein pairs, interacting pairs are usually far fewer than non-interacting ones, so the prediction of protein-protein interactions (PPIs) faces an imbalanced data problem. In this article, we introduce two ensemble methods to address it. These ensemble methods combine a cluster-based under-sampling technique with fusion classifiers. We evaluate the ensemble methods on a dataset from the Database of Interacting Proteins (DIP) using 10-fold cross-validation. All the prediction models achieve an area under the receiver operating characteristic curve (AUC) of about 95%. Our results show that the ensemble classifiers are quite effective in predicting PPIs; we also draw some useful conclusions about the performance of ensemble methods for PPI prediction on imbalanced data. The prediction software and all datasets used in this work can be obtained for free at http://cic.scu.edu.cn/bioinformatics/Ensemble_PPIs/index.html. (C) 2011 Elsevier Ltd. All rights reserved.
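The abstract names cluster-based under-sampling and AUC but gives no algorithmic detail. The following is a minimal sketch of the general idea only, not the authors' implementation. It assumes k-means as the clustering step, proportional sampling from each cluster, and toy 2-D feature vectors. The function names `kmeans`, `cluster_undersample`, and `auc`, and all parameter values, are hypothetical choices made for illustration.

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means (assumed clustering step); returns one label per point."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        for i, p in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Update step: move each centroid to the mean of its non-empty cluster.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def cluster_undersample(majority, n_keep, k=3, seed=0):
    """Keep roughly n_keep majority samples, drawn from each cluster
    in proportion to its size (rounding may shift the total slightly)."""
    rng = random.Random(seed)
    labels = kmeans(majority, k, seed=seed)
    kept = []
    for c in range(k):
        members = [p for p, lab in zip(majority, labels) if lab == c]
        quota = round(n_keep * len(members) / len(majority))
        kept.extend(rng.sample(members, min(quota, len(members))))
    return kept

def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy data (fabricated for illustration): 120 non-interacting vs 12 interacting pairs.
rng = random.Random(1)
negatives = [[rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)]
             for cx, cy in [(0, 0), (5, 5), (0, 5)] for _ in range(40)]
positives = [[rng.gauss(2.5, 0.5), rng.gauss(2.5, 0.5)] for _ in range(12)]
balanced_negatives = cluster_undersample(negatives, n_keep=len(positives))
print(len(positives), len(balanced_negatives))
```

In an ensemble setting, one plausible use is to repeat the under-sampling with different seeds, train one base classifier per balanced subset, and combine their scores; how the paper fuses its classifiers is not stated in the abstract.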
Pages: 36-41
Number of pages: 6