Identifying protein complexes based on an edge weight algorithm and core-attachment structure

Cited by: 19
Authors
Wang, Rongquan [1, 2]
Liu, Guixia [1, 2]
Wang, Caixia [3]
Affiliations
[1] Jilin Univ, Coll Comp Sci & Technol, 2699 Qianjin St, Changchun 130012, Jilin, Peoples R China
[2] Jilin Univ, Minist Educ, Key Lab Symbol Computat & Knowledge Engn, 2699 Qianjin St, Changchun 130012, Jilin, Peoples R China
[3] China Foreign Affairs Univ, Sch Int Econ, 24 Zhanlanguan Rd, Beijing 100037, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Protein complexes; Protein-protein interaction networks; Core-attachment structure; Spurious interactions; Structural similarity; INTERACTION NETWORKS; GENE ONTOLOGY; MODULAR ORGANIZATION; FUNCTIONAL MODULES; DATABASE; INTERACTOME; TOOL;
DOI
10.1186/s12859-019-3007-y
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Background: Identifying protein complexes from protein-protein interaction (PPI) networks is crucial for understanding the principles of cellular organization and functional mechanisms. In recent decades, numerous computational methods have been proposed to identify protein complexes. However, most current state-of-the-art methods still face unresolved challenges, including high false-positive rates, an inability to identify overlapping complexes, a lack of consideration for the inherent organization within protein complexes, and the absence of some biological attachment proteins.
Results: To overcome these limitations, we present EWCA, a protein complex identification method based on an edge weight method and the core-attachment structure, in which a complex consists of a core and some sparse attachment proteins. First, we propose a new weighting method to assess the reliability of interactions. Second, we identify protein complex cores by using the structural similarity between a seed and its direct neighbors. Third, we introduce a new method for detecting attachment proteins that can distinguish and identify both peripheral proteins and overlapping proteins. Finally, we bind attachment proteins to their corresponding complex cores to form protein complexes and discard redundant complexes. The experimental results indicate that EWCA outperforms existing state-of-the-art methods in terms of both accuracy and p-value. Furthermore, EWCA identifies many more statistically significant protein complexes. EWCA also achieves a better balance between accuracy and efficiency than some highly accurate state-of-the-art methods.
Conclusions: In a comprehensive comparison with twelve algorithms across different evaluation metrics, EWCA shows better performance for protein complex identification. The datasets and software are freely available for academic research at https://github.com/RongquanWang/EWCA.
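The core-identification step in the abstract (comparing a seed with its direct neighbors by structural similarity) can be illustrated with a small sketch. The abstract does not give EWCA's formula, so the code below assumes a common SCAN-style definition, the overlap of closed neighborhoods divided by the geometric mean of their sizes. The function names, the `threshold` value, and the toy network are hypothetical and are not taken from the EWCA software.

```python
from math import sqrt

def closed_neighborhood(graph, node):
    """Return the node's neighbors plus the node itself."""
    return graph[node] | {node}

def structural_similarity(graph, u, v):
    """SCAN-style similarity (an assumption, not necessarily EWCA's definition):
    |N[u] & N[v]| / sqrt(|N[u]| * |N[v]|)."""
    nu = closed_neighborhood(graph, u)
    nv = closed_neighborhood(graph, v)
    return len(nu & nv) / sqrt(len(nu) * len(nv))

def candidate_core(graph, seed, threshold=0.6):
    """Collect the seed and every direct neighbor whose similarity
    to the seed reaches the (hypothetical) threshold."""
    core = {seed}
    for neighbor in graph[seed]:
        if structural_similarity(graph, seed, neighbor) >= threshold:
            core.add(neighbor)
    return core

# Toy undirected PPI network: a triangle A-B-C with a tail C-D-E.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E")]
graph = {}
for u, v in edges:
    graph.setdefault(u, set()).add(v)
    graph.setdefault(v, set()).add(u)

core = candidate_core(graph, "C")
print(sorted(core))
```

In this toy network, seeding at `C` keeps the densely connected triangle `A`, `B`, `C` as the candidate core, while `D` falls below the threshold and would be left for a later attachment-detection step.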
Pages: 20