Mining of protein-protein interfacial residues from massive protein sequential and spatial data

Cited by: 5
Authors
Wang, Debby D. [1]
Zhou, Weiqiang [1]
Yan, Hong [1]
Affiliations
[1] City Univ Hong Kong, Dept Elect Engn, Kowloon, Hong Kong, Peoples R China
Keywords
Protein-protein interface prediction; 3D alpha shape modeling; Residue sequence profile; Joint mutual information (JMI); Neuro-fuzzy classifiers (NFCs); Neighborhood classifiers (NECs); CART; Extreme learning machines (ELMs); Naive Bayesian classifiers (NBCs); BIG DATA; INTERACTION SITES; DATA-BANK; INFORMATION; PREDICTION; NETWORK
DOI
10.1016/j.fss.2014.01.017
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Processing big data is a major challenge in bioinformatics. In this paper, we address the problem of identifying protein-protein interfacial residues from massive protein structural data. A protein set comprising 154993 residues was analyzed. We applied three-dimensional alpha shape modeling to search for surface and interfacial residues in this set, and adopted spatially neighboring residue profiles to characterize each residue. These residue profiles, which capture the sequential and spatial information of proteins, translated the original data into a large matrix. After refining this matrix vertically and horizontally, we comparatively implemented a series of popular learning procedures, including neuro-fuzzy classifiers (NFCs), CART, neighborhood classifiers (NECs), extreme learning machines (ELMs) and naive Bayesian classifiers (NBCs), to predict the interfacial residues, aiming to investigate the sensitivity of these massive structural data to different learning mechanisms. As a result, ELMs, CART and NFCs performed better in terms of computational cost, while NFCs, NBCs and ELMs provided favorable prediction accuracies. Overall, NFCs, NBCs and ELMs are favorable choices for handling this type of data quickly and accurately. More importantly, the marginal differences between the prediction performances of these methods imply that this type of data is insensitive to the choice of learning mechanism. (C) 2014 Elsevier B.V. All rights reserved.
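As a rough illustration of one learner named in the abstract, the sketch below shows the general idea of an extreme learning machine: the hidden-layer weights are drawn at random and left untrained, and only the output weights are fitted by least squares. It is not the authors' implementation. The function names (train_elm, predict_elm), the hidden-layer size, the tanh activation, and the synthetic toy matrix standing in for residue profiles are all assumptions made for illustration.

```python
import numpy as np

def train_elm(X, y, n_hidden=50, seed=0):
    """Fit a basic ELM (hypothetical helper, not the paper's code)."""
    rng = np.random.default_rng(seed)
    # Random input weights and biases: drawn once and never trained.
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)            # hidden-layer activations
    # Output weights: least-squares solution via the pseudo-inverse.
    beta = np.linalg.pinv(H) @ y
    return W, b, beta

def predict_elm(model, X):
    """Return 0/1 labels by thresholding the ELM output at 0.5."""
    W, b, beta = model
    return (np.tanh(X @ W + b) @ beta > 0.5).astype(int)

# Synthetic toy data (an assumption; real inputs would be residue profiles).
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(float)   # 1 = "interfacial", 0 = other

model = train_elm(X, y)
pred = predict_elm(model, X)
train_acc = (pred == y).mean()
print(f"training accuracy on toy data: {train_acc:.2f}")
```

Because training reduces to a single pseudo-inverse, this kind of model is cheap to fit, which is consistent with the abstract listing ELMs among the methods with low computational cost.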
Pages: 101-116
Page count: 16
Related papers
50 in total
  • [11] A Coclustering Approach for Mining Large Protein-Protein Interaction Networks
    Pizzuti, Clara
    Rombo, Simona E.
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2012, 9 (03) : 717 - 730
  • [12] Protein-protein interaction predictions using text mining methods
    Papanikolaou, Nikolas
    Pavlopoulos, Georgios A.
    Theodosiou, Theodosios
    Iliopoulos, Ioannis
    METHODS, 2015, 74 : 47 - 53
  • [13] DeepRank: a deep learning framework for data mining 3D protein-protein interfaces
    Renaud, Nicolas
    Geng, Cunliang
    Georgievska, Sonja
    Ambrosetti, Francesco
    Ridder, Lars
    Marzella, Dario F.
    Reau, Manon F.
    Bonvin, Alexandre M. J. J.
    Xue, Li C.
    NATURE COMMUNICATIONS, 2021, 12 (01)
  • [14] PocketQuery: protein-protein interaction inhibitor starting points from protein-protein interaction structure
    Koes, David Ryan
    Camacho, Carlos J.
    NUCLEIC ACIDS RESEARCH, 2012, 40 (W1) : W387 - W392
  • [15] Sequence and structural analysis of binding site residues in protein-protein complexes
    Gromiha, M. Michael
    Yokota, Kiyonobu
    Fukui, Kazuhiko
    INTERNATIONAL JOURNAL OF BIOLOGICAL MACROMOLECULES, 2010, 46 (02) : 187 - 192
  • [16] Sequence and Structural features of binding site residues in protein-protein complexes
    Gromiha, M. Michael
    Saranya, N.
    Selvaraj, S.
    Jayaram, B.
    Fukui, Kazuhiko
    2010 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE, 2010, : 53 - 56
  • [17] Towards a better understanding of the specificity of protein-protein interaction
    Kysilka, Jiri
    Vondrasek, Jiri
    JOURNAL OF MOLECULAR RECOGNITION, 2012, 25 (11) : 604 - 615
  • [18] An Investigation on Characteristic of Residues Involved in Intrinsically Disordered Protein-Protein Interaction
    Dong Chuan
    Cao Zan-Xia
    Zhao Li-Ling
    Suo Zhen-Peng
    Wang Ji-Hua
    PROGRESS IN BIOCHEMISTRY AND BIOPHYSICS, 2014, 41 (05) : 462 - 471
  • [19] ProNA2020 predicts protein-DNA, protein-RNA, and protein-protein binding proteins and residues from sequence
    Qiu, Jiajun
    Bernhofer, Michael
    Heinzinger, Michael
    Kemper, Sofie
    Norambuena, Tomas
    Melo, Francisco
    Rost, Burkhard
    JOURNAL OF MOLECULAR BIOLOGY, 2020, 432 (07) : 2428 - 2443
  • [20] Bayesian inference of protein-protein interactions from biological literature
    Chowdhary, Rajesh
    Zhang, Jinfeng
    Liu, Jun S.
    BIOINFORMATICS, 2009, 25 (12) : 1536 - 1542