Spectral clustering on protein-protein interaction networks via constructing affinity matrix using attributed graph embedding

Cited by: 84
Authors
Berahmand, Kamal [1]
Nasiri, Elahe [2]
Mohammadiani, Rojiar Pir [3]
Li, Yuefeng [1]
Affiliations
[1] Queensland Univ Technol QUT, Sci & Engn Fac, Sch Comp Sci, Brisbane, Qld, Australia
[2] Azarbaijan Shahid Madani Univ, Dept Informat Technol & Commun, Tabriz, Iran
[3] Univ Kurdistan, Dept Comp Engn, Sanandaj, Iran
Keywords
Protein-protein interaction network; Protein complexes identification; Spectral clustering; Graph embedding; Affinity matrix; Community detection; Complexes; Information
DOI
10.1016/j.compbiomed.2021.104933
Chinese Library Classification number
Q [Biological Sciences]
Discipline classification code
07; 0710; 09
Abstract
The identification of protein complexes in protein-protein interaction networks is a fundamental and essential problem for revealing the underlying mechanisms of biological processes. However, most existing protein complex identification methods consider only a network's topological structure and therefore miss the advantage of using the nodes' feature information. In protein-protein interaction networks, both topological structure and node features are essential ingredients for identifying protein complexes. The spectral clustering method uses the eigenvalues of the affinity matrix of the data to map the data to a low-dimensional space. It has attracted much attention in recent years as one of the most efficient algorithms in the subcategory of dimensionality reduction. In this paper, a new version of spectral clustering, named text-associated DeepWalk-Spectral Clustering (TADWSC), is proposed for attributed networks, in which the identified protein complexes have both structural cohesiveness and attribute homogeneity. Since the performance of spectral clustering depends heavily on the quality of the affinity matrix, the proposed method uses text-associated DeepWalk (TADW) to compute the embedding vectors of proteins. The affinity matrix is then computed from the cosine similarity between each pair of low-dimensional vectors, which considerably improves the accuracy of the affinity matrix. Experimental results show that the method performs unexpectedly well in comparison with existing state-of-the-art methods on both real protein network datasets and synthetic networks.
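The pipeline described in the abstract (embedding vectors, then a cosine-similarity affinity matrix, then spectral clustering) can be illustrated with a minimal sketch. This is not the authors' implementation: the TADW embedding step is not reproduced, and the input `emb` is assumed to be an already-computed matrix with one embedding row per protein. The function names, the clipping of negative similarities to zero, the use of the symmetric normalized Laplacian, and the simple deterministic k-means are all assumptions made for illustration.

```python
import numpy as np


def cosine_affinity(emb):
    """Pairwise cosine similarity between embedding rows, used as affinity."""
    norms = np.linalg.norm(emb, axis=1, keepdims=True)
    unit = emb / np.maximum(norms, 1e-12)
    sim = unit @ unit.T
    # Assumption: negative similarities are treated as "no affinity".
    sim = np.clip(sim, 0.0, 1.0)
    np.fill_diagonal(sim, 0.0)  # no self-loops
    return sim


def spectral_embedding(affinity, k):
    """Rows of the k eigenvectors with the smallest eigenvalues of the
    symmetric normalized Laplacian L = I - D^{-1/2} A D^{-1/2}."""
    deg = affinity.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(deg, 1e-12))
    lap = np.eye(len(affinity)) - d_inv_sqrt[:, None] * affinity * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(lap)  # eigenvalues are returned in ascending order
    rows = vecs[:, :k]
    return rows / np.maximum(np.linalg.norm(rows, axis=1, keepdims=True), 1e-12)


def kmeans(points, k, iters=50):
    """Plain k-means with deterministic farthest-point initialization."""
    centers = [points[0]]
    for _ in range(1, k):
        dist = np.min([np.sum((points - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(iters):
        sq = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = np.argmin(sq, axis=1)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def cluster_embeddings(emb, k):
    """Cosine affinity -> spectral embedding -> k-means labels."""
    return kmeans(spectral_embedding(cosine_affinity(emb), k), k)


# Toy stand-in for protein embeddings: two well-separated groups of three rows.
toy_emb = np.array([
    [1.0, 0.00, 0.0], [0.9, 0.10, 0.0], [1.0, 0.05, 0.0],
    [0.0, 0.00, 1.0], [0.0, 0.10, 0.9], [0.0, 0.05, 1.0],
])
toy_labels = cluster_embeddings(toy_emb, k=2)
```

On the toy input, the first three rows should receive one label and the last three the other; which group gets label 0 is arbitrary. In practice, each resulting cluster would be read as a candidate protein complex, and the number of clusters `k` has to be chosen separately.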
Pages: 9