Determining the minimum number of protein-protein interactions required to support known protein complexes

被引：9

作者：

Nakajima, Natsu ^{[1
]}

Hayashida, Morihiro ^{[2
]}

Jansson, Jesper ^{[3
]}

Maruyama, Osamu ^{[4
]}

Akutsu, Tatsuya ^{[5
]}

机构：

[1] Univ Tokyo, Inst Mol & Cellular Biosci, Bunkyo Ku, 1-1-1 Yayoi, Tokyo 1130032, Japan

[2] Matsue Coll, Natl Inst Technol, Dept Elect Engn & Comp Sci, 14-4 Nishiikumacho, Matsue, Shimane 6908518, Japan

[3] Hong Kong Polytech Univ, Dept Comp, Kowloon, Hong Kong, Peoples R China

[4] Kyushu Univ, Inst Math Ind, Nishi Ku, 744 Motooka, Fukuoka 8190395, Japan

[5] Kyoto Univ, Inst Chem Res, Bioinformat Ctr, Uji, Kyoto 6110011, Japan

来源：

PLOS ONE | 2018年 / 13卷 / 04期

关键词：

MOLECULAR INTERACTION DATABASE; INTERACTION NETWORKS; SACCHAROMYCES-CEREVISIAE; CLUSTERING-TREE; YEAST; PREDICTION; DATASETS; INTACT; CANCER;

D O I：

10.1371/journal.pone.0195545

中图分类号：

O [数理科学和化学]; P [天文学、地球科学]; Q [生物科学]; N [自然科学总论];

学科分类号：

07 ; 0710 ; 09 ;

摘要：

The prediction of protein complexes from protein-protein interactions (PPIs) is a well-studied problem in bioinformatics. However, the currently available PPI data is not enough to describe all known protein complexes. In this paper, we express the problem of determining the minimum number of (additional) required protein-protein interactions as a graph theoretic problem under the constraint that each complex constitutes a connected component in a PPI network. For this problem, we develop two computational methods: one is based on integer linear programming (ILPMinPPI) and the other one is based on an existing greedytype approximation algorithm (GreedyMinPPI) originally developed in the context of communication and social networks. Since the former method is only applicable to datasets of small size, we apply the latter method to a combination of the CYC2008 protein complex dataset and each of eight PPI datasets (STRING, MINT, BioGRID, IntAct, DIP, BIND, WI-PHI, iRefIndex). The results show that the minimum number of additional required PPIs ranges from 51 (STRING) to 964 (BIND), and that even the four best PPI databases, STRING (51), BioGRID (67), WI-PHI (93) and iRefIndex (85), do not include enough PPIs to form all CYC2008 protein complexes. We also demonstrate that the proposed problem framework and our solutions can enhance the prediction accuracy of existing PPI prediction methods. ILPMinPPI can be freely downloaded from http://sunflower.kuicr.kyoto-u.ac.jp/similar to nakajima/.

引用

页数：17

共 46 条

[1] Structure-based assembly of protein complexes in yeast
Aloy, P
Böttcher, B
Ceulemans, H
Leutwein, C
Mellwig, C
Fischer, S
Gavin, AC
Bork, P
Superti-Furga, G
Serrano, L
Russell, RB
[J]. SCIENCE, 2004, 303 (5666) : 2026 - 2029
[2] Network construction with subgraph connectivity constraints
Angluin, Dana
Aspnes, James
Reyzin, Lev
[J]. JOURNAL OF COMBINATORIAL OPTIMIZATION, 2015, 29 (02) : 418 - 432
[3] The IntAct molecular interaction database in 2010
Aranda, B.
Achuthan, P.
Alam-Faruque, Y.
Armean, I.
Bridge, A.
Derow, C.
Feuermann, M.
Ghanbarian, A. T.
Kerrien, S.
Khadake, J.
Kerssemakers, J.
Leroy, C.
Menden, M.
Michaut, M.
Montecchi-Palazzi, L.
Neuhauser, S. N.
Orchard, S.
Perreau, V.
Roechert, B.
van Eijk, K.
Hermjakob, H.
[J]. NUCLEIC ACIDS RESEARCH, 2010, 38 : D525 - D531
[4] An automated method for finding molecular complexes in large protein interaction networks
Bader, GD
Hogue, CW
[J]. BMC BIOINFORMATICS, 2003, 4 (1)
[5] Bader GD, 2003, NUCLEIC ACIDS RES, V31, P248, DOI 10.1093/nar/gkg056
[6] MINT: the molecular INTeraction database
Chatr-aryamontri, Andrew
Ceol, Arnaud
Palazzi, Luisa Montecchi
Nardelli, Giuliano
Schneider, Maria Victoria
Castagnoli, Luisa
Cesareni, Gianni
[J]. NUCLEIC ACIDS RESEARCH, 2007, 35 : D572 - D574
[7] KUPS: constructing datasets of interacting and non-interacting protein pairs with associated attributions
Chen, Xue-wen
Jeong, Jong Cheol
Dermyer, Patrick
[J]. NUCLEIC ACIDS RESEARCH, 2011, 39 : D750 - D754
[8] Chockler G, 2007, PODC'07: PROCEEDINGS OF THE 26TH ANNUAL ACM SYMPOSIUM ON PRINCIPLES OF DISTRIBUTED COMPUTING, P109
[9] Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae
Collins, Sean R.
Kemmeren, Patrick
Zhao, Xue-Chu
Greenblatt, Jack F.
Spencer, Forrest
Holstege, Frank C. P.
Weissman, Jonathan S.
Krogan, Nevan J.
[J]. MOLECULAR & CELLULAR PROTEOMICS, 2007, 6 (03) : 439 - 450
[10] An efficient algorithm for large-scale detection of protein families
Enright, AJ
Van Dongen, S
Ouzounis, CA
[J]. NUCLEIC ACIDS RESEARCH, 2002, 30 (07) : 1575 - 1584

← 1 2 3 4 5 →