Mining of protein-protein interfacial residues from massive protein sequential and spatial data

Cited by: 5

Authors:

Wang, Debby D. [1]

Zhou, Weiqiang [1]

Yan, Hong [1]

Affiliations:

[1] City Univ Hong Kong, Dept Elect Engn, Kowloon, Hong Kong, Peoples R China

Source:

FUZZY SETS AND SYSTEMS | 2015 / Vol. 258

Keywords:

Protein-protein interface prediction; 3D alpha shape modeling; Residue sequence profile; Joint mutual information (JMI); Neuro-fuzzy classifiers (NFCs); Neighborhood classifiers (NECs); CART; Extreme learning machines (ELMs); Naive Bayesian classifiers (NBCs); BIG DATA; INTERACTION SITES; DATA-BANK; INFORMATION; PREDICTION; NETWORK;

DOI:

10.1016/j.fss.2014.01.017

Chinese Library Classification (CLC) number:

TP301 [Theory, Methods];

Discipline classification code:

081202 ;

Abstract:

Processing big data is a major challenge in bioinformatics. In this paper, we addressed the problem of identifying protein-protein interfacial residues from massive protein structural data. A protein set comprising 154993 residues was analyzed. We applied three-dimensional alpha shape modeling to search for surface and interfacial residues in this set, and adopted spatially neighboring residue profiles to characterize each residue. These residue profiles, which capture the sequential and spatial information of proteins, translated the original data into a large matrix. After refining this matrix vertically and horizontally, we implemented and compared a series of popular learning procedures, including neuro-fuzzy classifiers (NFCs), CART, neighborhood classifiers (NECs), extreme learning machines (ELMs) and naive Bayesian classifiers (NBCs), to predict the interfacial residues, aiming to investigate the sensitivity of these massive structural data to different learning mechanisms. As a result, ELMs, CART and NFCs performed better in terms of computational cost, while NFCs, NBCs and ELMs provided favorable prediction accuracies. Overall, NFCs, NBCs and ELMs are favorable choices for handling this type of data quickly and accurately. More importantly, the marginal differences between the prediction performances of these methods imply that this type of data is insensitive to the choice of learning mechanism. (C) 2014 Elsevier B.V. All rights reserved.
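The step in which each residue is characterized by its spatial neighbors, turning the data set into a large matrix, can be illustrated with a minimal sketch. It is not the authors' implementation: the 8.0 angstrom cutoff, the function names `neighbor_profile` and `profile_matrix`, and the use of simple amino-acid counts in place of the sequence profiles used in the paper are all assumptions made for illustration. The alpha shape step that identifies surface and interfacial residues is not shown.

```python
import math
from collections import Counter

# The 20 standard amino acids, one-letter codes; column order of the profile.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def neighbor_profile(residues, index, cutoff=8.0):
    """Count amino-acid types among the spatial neighbors of one residue.

    residues: list of (one_letter_code, (x, y, z)) tuples.
    cutoff:   hypothetical distance threshold in angstroms (an assumption).
    """
    _, center = residues[index]
    counts = Counter()
    for j, (aa, xyz) in enumerate(residues):
        # Skip the residue itself; keep anything within the cutoff distance.
        if j != index and math.dist(center, xyz) <= cutoff:
            counts[aa] += 1
    return [counts[aa] for aa in AMINO_ACIDS]


def profile_matrix(residues, cutoff=8.0):
    """Stack one 20-column neighbor profile per residue into a matrix."""
    return [neighbor_profile(residues, i, cutoff) for i in range(len(residues))]


# Toy input: four residues with made-up coordinates.
toy = [
    ("A", (0.0, 0.0, 0.0)),
    ("G", (3.0, 0.0, 0.0)),
    ("K", (0.0, 4.0, 0.0)),
    ("W", (20.0, 0.0, 0.0)),
]
matrix = profile_matrix(toy)
```

Each row of `matrix` describes one residue and each column one amino-acid type, so a matrix of this shape could then be passed, after row and column refinement, to any of the classifiers named in the abstract.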


Pages: 101 - 116

Page count: 16

50 items in total

[1] Interfacial residues in protein-protein complexes are in the eyes of the beholder
Parvathy, Jayadevan
Yazhini, Arangasamy
Srinivasan, Narayanaswamy
Sowdhamini, Ramanathan
PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS, 2024, 92 (04) : 509 - 528
[2] Mining from protein-protein interactions
Mamitsuka, Hiroshi
WILEY INTERDISCIPLINARY REVIEWS-DATA MINING AND KNOWLEDGE DISCOVERY, 2012, 2 (05) : 400 - 410
[3] Identifying protein-protein interfacial residues in heterocomplexes using residue conservation scores
Li, Jing-Jing
Huang, De-Shuang
Wang, Bing
Chen, Pen
INTERNATIONAL JOURNAL OF BIOLOGICAL MACROMOLECULES, 2006, 38 (3-5) : 241 - 247
[4] PPLook: an automated data mining tool for protein-protein interaction
Zhang, Shao-Wu
Li, Yao-Jun
Xia, Li
Pan, Quan
BMC BIOINFORMATICS, 2010, 11
[5] Implication of Terminal Residues at Protein-Protein and Protein-DNA Interfaces
Martin, Olivier M. F.
Etheve, Loic
Launay, Guillaume
Martin, Juliette
PLOS ONE, 2016, 11 (09)
[6] Unique Physicochemical Patterns of Residues in Protein-Protein Interfaces
Lazar, Tamas
Guharoy, Mainak
Schad, Eva
Tompa, Peter
JOURNAL OF CHEMICAL INFORMATION AND MODELING, 2018, 58 (10) : 2164 - 2173
[7] Characterization of Protein-Protein Interaction Interfaces from a Single Species
Talavera, David
Robertson, David L.
Lovell, Simon C.
PLOS ONE, 2011, 6 (06)
[8] Scoring by Intermolecular Pairwise Propensities of Exposed Residues (SIPPER): A New Efficient Potential for Protein-Protein Docking
Pons, Carles
Talavera, David
de la Cruz, Xavier
Orozco, Modesto
Fernandez-Recio, Juan
JOURNAL OF CHEMICAL INFORMATION AND MODELING, 2011, 51 (02) : 370 - 377
[9] Using data fusion for scoring reliability of protein-protein interactions
Vazifedoost, Alireza
Rahgozar, Maseud
Moshiri, Behzad
Sadeghi, Mehdi
Hon Nian Chua
See Kiong Ng
Wong, Limsoon
JOURNAL OF BIOINFORMATICS AND COMPUTATIONAL BIOLOGY, 2014, 12 (04)
[10] ProPairs: A Data Set for Protein-Protein Docking
Krull, Florian
Korff, Gerrit
Elghobashi-Meinhardt, Nadia
Knapp, Ernst-Walter
JOURNAL OF CHEMICAL INFORMATION AND MODELING, 2015, 55 (07) : 1495 - 1507
