Mining of protein-protein interfacial residues from massive protein sequential and spatial data

Cited by: 5
Authors
Wang, Debby D. [1]
Zhou, Weiqiang [1]
Yan, Hong [1]
Affiliations
[1] City Univ Hong Kong, Dept Elect Engn, Kowloon, Hong Kong, Peoples R China
Keywords
Protein-protein interface prediction; 3D alpha shape modeling; Residue sequence profile; Joint mutual information (JMI); Neuro-fuzzy classifiers (NFCs); Neighborhood classifiers (NECs); CART; Extreme learning machines (ELMs); Naive Bayesian classifiers (NBCs); BIG DATA; INTERACTION SITES; DATA-BANK; INFORMATION; PREDICTION; NETWORK;
DOI
10.1016/j.fss.2014.01.017
Chinese Library Classification (CLC) number
TP301 [Theory, methods];
Discipline classification code
081202;
Abstract
Processing big data is a major challenge in bioinformatics. In this paper, we addressed the problem of identifying protein-protein interfacial residues from massive protein structural data. A protein set comprising 154993 residues was analyzed. We applied three-dimensional alpha shape modeling to search for surface and interfacial residues in this set, and adopted spatially neighboring residue profiles to characterize each residue. These residue profiles, which capture the sequential and spatial information of proteins, translated the original data into a large matrix. After refining this matrix vertically and horizontally, we comparatively implemented a series of popular learning procedures, including neuro-fuzzy classifiers (NFCs), CART, neighborhood classifiers (NECs), extreme learning machines (ELMs) and naive Bayesian classifiers (NBCs), to predict the interfacial residues, aiming to investigate the sensitivity of these massive structural data to different learning mechanisms. As a result, ELMs, CART and NFCs performed better in terms of computational cost, while NFCs, NBCs and ELMs provided favorable prediction accuracies. Overall, NFCs, NBCs and ELMs are favorable choices for handling this type of data quickly and accurately. More importantly, the marginal differences between the prediction performances of these methods imply that this type of data is insensitive to the choice of learning mechanism. (C) 2014 Elsevier B.V. All rights reserved.
Pages: 101-116
Page count: 16
Related papers
50 in total
  • [21] Inferring the microscopic surface energy of protein-protein interfaces from mutation data
    Moal, Iain H.
    Dapkunas, Justas
    Fernandez-Recio, Juan
    PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS, 2015, 83 (04) : 640 - 650
  • [22] Interface residues of transient protein-protein complexes have extensive intra-protein interactions apart from inter-protein interactions
    Jayashree, Srinivasan
    Murugavel, Pavalam
    Sowdhamini, Ramanathan
    Srinivasan, Narayanaswamy
    BIOLOGY DIRECT, 2019, 14 (1)
  • [23] How reliable are experimental protein-protein interaction data?
    Sprinzak, E
    Sattath, S
    Margalit, H
    JOURNAL OF MOLECULAR BIOLOGY, 2003, 327 (05) : 919 - 923
  • [24] A Parallel and Distributed Computing System for Protein-Protein Interaction Literature Mining
    Lee, Hsi-Chieh
    Huang, Szu-Wei
    CURRENT PROTEOMICS, 2018, 15 (05) : 344 - 351
  • [25] Learning an enriched representation from unlabeled data for protein-protein interaction extraction
    Li, Yanpeng
    Hu, Xiaohua
    Lin, Hongfei
    Yang, Zhihao
    BMC BIOINFORMATICS, 2010, 11
  • [26] Iteratively Predict Protein Functions from Protein-Protein Interactions
    Chi, Xiaoxiao
    Hou, Jingyu
    INFORMATION TECHNOLOGY AND AGRICULTURAL ENGINEERING, 2012, 134 : 771 - 778
  • [27] Prediction Protein-Protein Interactions with LSTM
    Tao, Zheng
    Yao, Jiahao
    Yuan, Chao
    Zhao, Ning
    Yang, Bin
    Chen, Baitong
    Bao, Wenzheng
    SIMULATION TOOLS AND TECHNIQUES, SIMUTOOLS 2021, 2022, 424 : 540 - 545
  • [28] Mining Minimal Motif Pair Sets Maximally Covering Interactions in a Protein-Protein Interaction Network
    Boyen, Peter
    Neven, Frank
    van Dyck, Dries
    Valentim, Felipe L.
    van Dijk, Aalt D. J.
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2013, 10 (01) : 73 - 86
  • [29] Towards Data Analytics of Pathogen-Host Protein-Protein Interaction: A survey
    Chen, Huaming
    Shen, Jun
    Wang, Lei
    Song, Jiangning
    2016 IEEE INTERNATIONAL CONGRESS ON BIG DATA - BIGDATA CONGRESS 2016, 2016, : 377 - 388
  • [30] Galaxy InteractoMIX: An Integrated Computational Platform for the Study of Protein-Protein Interaction Data
    Mirela-Bota, Patricia
    Aguirre-Plans, Joaquim
    Meseguer, Alberto
    Galletti, Cristiano
    Segura, Joan
    Planas-Iglesias, Joan
    Garcia-Garcia, Javi
    Guney, Emre
    Oliva, Baldo
    Fernandez-Fuentes, Narcis
    JOURNAL OF MOLECULAR BIOLOGY, 2021, 433 (11)