Detection of protein complexes from multiple protein interaction networks using graph embedding

Cited by: 14
Authors
Liu, Xiaoxia [1]
Yang, Zhihao [2]
Sang, Shengtian [2]
Lin, Hongfei [2]
Wang, Jian [2]
Xu, Bo [3]
Affiliations
[1] Dalian Maritime Univ, Informat Sci & Technol Coll, Dalian 116026, Peoples R China
[2] Dalian Univ Technol, Coll Comp Sci & Technol, Dalian 116024, Peoples R China
[3] Dalian Univ Technol, Sch Software Technol, Dalian 116024, Peoples R China
Keywords
Network embedding; Protein complex identification; Protein-protein interaction networks; PPI NETWORK; IDENTIFICATION; FRAMEWORK; ALGORITHM;
DOI
10.1016/j.artmed.2019.04.001
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Cellular processes are typically carried out by protein complexes rather than individual proteins. Identifying protein complexes is therefore key to understanding the principles of cellular organization and function. Protein complexes also correspond to groups of interacting genes underlying similar diseases, which highlights their therapeutic importance. With advances in life science and computing science, a growing amount of protein-protein interaction (PPI) data has become available, making it possible to predict protein complexes from PPI networks. However, PPI data produced by high-throughput experiments often contain many false positive interactions and miss true interactions (false negatives), which makes accurate complex prediction difficult. In this paper, we present a new method, MEMO (Multiple network Embedding for coMplex detectiOn), to detect protein complexes. MEMO integrates multiple PPI datasets from different species into a single PPI network using functional orthology information across species, and then applies a graph embedding technique to embed the protein nodes of the network into a continuous vector space, so as to quantify the relationships between nodes and better guide the protein complex detection process. Finally, it uses a seed-and-extend strategy to identify protein complexes from multiple PPI networks based on the similarities of the corresponding protein representations. As part of our approach, we also define a new quality measure that combines cluster cohesiveness and cluster density to estimate the likelihood that a detected complex is a real protein complex. Extensive experimental results demonstrate that the proposed method outperforms state-of-the-art complex detection techniques.
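The abstract does not give the formulas behind MEMO, so the following is only a minimal sketch of the general idea of a seed-and-extend search guided by node embeddings and a combined quality score. Everything in it is an assumption made for illustration: the function names, the `min_sim` threshold, the use of cosine similarity, the definition of cohesiveness as internal edges over internal plus boundary edges, and the choice to multiply density by cohesiveness. It is not the authors' implementation, and the toy network and embeddings are invented.

```python
import math
from itertools import combinations

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def internal_edges(cluster, adj):
    """Number of edges with both endpoints inside the cluster."""
    return sum(1 for a, b in combinations(cluster, 2) if b in adj[a])

def density(cluster, adj):
    """Fraction of possible internal edges that are present."""
    n = len(cluster)
    if n < 2:
        return 0.0
    return 2.0 * internal_edges(cluster, adj) / (n * (n - 1))

def cohesiveness(cluster, adj):
    """Internal edges divided by internal plus boundary edges (assumed definition)."""
    inside = internal_edges(cluster, adj)
    boundary = sum(1 for a in cluster for b in adj[a] if b not in cluster)
    total = inside + boundary
    return inside / total if total else 0.0

def quality(cluster, adj):
    """Hypothetical combined score: product of density and cohesiveness."""
    return density(cluster, adj) * cohesiveness(cluster, adj)

def seed_and_extend(seed, adj, emb, min_sim=0.5):
    """Greedily grow a cluster from a seed while the quality score improves."""
    cluster = {seed}
    while True:
        frontier = {n for m in cluster for n in adj[m]} - cluster
        best, best_q = None, quality(cluster, adj)
        for cand in sorted(frontier):
            # Only consider neighbours whose embedding is close to the cluster.
            mean_sim = sum(cosine(emb[cand], emb[m]) for m in cluster) / len(cluster)
            if mean_sim < min_sim:
                continue
            q = quality(cluster | {cand}, adj)
            if q > best_q:
                best, best_q = cand, q
        if best is None:
            return cluster
        cluster.add(best)

# Toy undirected network (invented): a triangle A-B-C with a tail C-D-E.
adj = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}
# Toy 2-D embeddings (invented); a real pipeline would learn these from the network.
emb = {
    "A": [1.0, 0.0],
    "B": [0.9, 0.1],
    "C": [0.8, 0.2],
    "D": [0.0, 1.0],
    "E": [0.1, 0.9],
}

result = seed_and_extend("A", adj, emb)
print(sorted(result))
```

In this sketch the embedding similarity acts as a filter on which neighbours may join a cluster, while the quality score decides whether adding a neighbour is worthwhile. A real system would obtain the embeddings from a graph embedding model trained on the integrated multi-species network rather than from hand-written vectors.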
Pages: 107-115
Number of pages: 9