Community Detection using Semi-supervised Learning with Graph Convolutional Network on GPUs

Cited by: 18
Authors
Sattar, Naw Safrin [1]
Arifuzzaman, Shaikh [1]
Affiliations
[1] Univ New Orleans, Dept Comp Sci, New Orleans, LA 70148 USA
Source
2020 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA) | 2020
Keywords
graph convolutional network; community detection; graph problems; semi-supervised learning; machine learning; deep learning; neural network; optimization; GPU;
DOI
10.1109/BigData50022.2020.9378123
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Graph Convolutional Networks (GCNs) have drawn considerable research attention in recent times. Many problems from diverse domains can be solved efficiently using GCNs. Community detection in graphs is a computationally challenging graph analytic problem. Because only a limited amount of labelled data (known communities) is available, we are motivated to use a learning approach to community discovery. However, detecting communities in large graphs using semi-supervised learning with GCN is still an open problem due to scalability and accuracy issues. In this paper, we present a scalable method for detecting communities based on GCN via semi-supervised node classification. We optimize the hyper-parameters of our semi-supervised community detection model using PyTorch with CUDA in a GPU environment. We apply Mini-batch Gradient Descent for larger datasets to resolve the memory issue. We present an experimental evaluation on different real-world networks from diverse domains. Our model achieves up to 86.9% accuracy and a 0.85 F1 score on these practical datasets. We also show that using an identity matrix as features, based on the graph connectivity, yields higher accuracy than using vertex-based graph features. Using GPUs rather than CPUs speeds up the model by a factor of 4.
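The abstract's core idea, semi-supervised node classification with a GCN that uses an identity matrix as node features, can be illustrated with a minimal sketch. This is not the authors' implementation: it uses NumPy on the CPU instead of PyTorch with CUDA so that it stays self-contained, it shows only a forward pass and a masked loss (no training loop or mini-batching), and the toy graph, layer sizes, random weights, and all function names are assumptions made for illustration.

```python
import numpy as np

def normalize_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2, the renormalized adjacency commonly used in GCN layers."""
    a_tilde = adj + np.eye(adj.shape[0])          # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_tilde.sum(axis=1))
    return a_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_forward(a_hat, features, w0, w1):
    """Two-layer GCN forward pass: softmax(A_hat * ReLU(A_hat * X * W0) * W1)."""
    hidden = np.maximum(a_hat @ features @ w0, 0.0)
    logits = a_hat @ hidden @ w1
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)

def masked_cross_entropy(probs, labels, mask):
    """Average cross-entropy over the labelled nodes only (the semi-supervised part)."""
    idx = np.flatnonzero(mask)
    return float(-np.log(probs[idx, labels[idx]] + 1e-12).mean())

# Hypothetical toy graph: two triangles joined by one edge (6 nodes, 2 communities).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = np.zeros((6, 6))
for u, v in edges:
    adj[u, v] = adj[v, u] = 1.0

a_hat = normalize_adjacency(adj)
features = np.eye(6)                      # identity features: one-hot vector per node

rng = np.random.default_rng(0)
w0 = rng.normal(scale=0.1, size=(6, 4))   # assumed hidden size of 4
w1 = rng.normal(scale=0.1, size=(4, 2))   # 2 output classes (communities)

labels = np.array([0, 0, 0, 1, 1, 1])
train_mask = np.array([True, False, False, False, False, True])  # only 2 labelled nodes

probs = gcn_forward(a_hat, features, w0, w1)
loss = masked_cross_entropy(probs, labels, train_mask)
```

With identity features, the first layer reduces to multiplying the normalized adjacency by the weight matrix, so each node effectively learns its own embedding row from connectivity alone. The loss is computed only on the nodes selected by `train_mask`, while the predictions in `probs` cover every node.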
Pages: 5237-5246
Page count: 10