Towards Sparsification of Graph Neural Networks

Cited by: 13
Authors
Peng, Hongwu [1 ]
Gurevin, Deniz [1 ]
Huang, Shaoyi [1 ]
Geng, Tong [2 ]
Jiang, Weiwen [3 ]
Khan, Orner [1 ]
Ding, Caiwen [1 ]
Affiliations
[1] Univ Connecticut, Storrs, CT 06269 USA
[2] Univ Rochester, Rochester, NY 14627 USA
[3] George Mason Univ, Fairfax, VA 22030 USA
Source
2022 IEEE 40TH INTERNATIONAL CONFERENCE ON COMPUTER DESIGN (ICCD 2022) | 2022
Funding
U.S. National Science Foundation;
Keywords
graph; GNN; sparsification; model compression; sparse training; Surrogate Lagrangian Relaxation (SLR);
DOI
10.1109/ICCD56317.2022.00048
Chinese Library Classification (CLC)
TP3 [Computing Technology, Computer Technology];
Discipline Code
0812;
Abstract
As real-world graphs grow in size, larger GNN models with billions of parameters are being deployed. The high parameter count of such models makes training and inference on graphs expensive and challenging. To reduce the computational and memory costs of GNNs, optimization methods such as pruning redundant nodes and edges in the input graphs have been commonly adopted. However, model compression, which directly targets the sparsification of model layers, has been mostly limited to traditional Deep Neural Networks (DNNs) used for tasks such as image classification and object detection. In this paper, we utilize two state-of-the-art model compression methods, (1) train and prune and (2) sparse training, for the sparsification of weight layers in GNNs. We evaluate and compare the efficiency of both methods in terms of accuracy, training sparsity, and training FLOPs on real-world graphs. Our experimental results show that on the ia-email, wiki-talk, and stackoverflow datasets for link prediction, sparse training achieves accuracy comparable to the train and prune method with much lower training FLOPs. On the brain dataset for node classification, sparse training uses fewer FLOPs (less than 1/7 of the FLOPs of the train and prune method) and preserves much better accuracy under extreme model sparsity. Our model sparsification code is publicly available on GitHub(1).
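To make the idea of sparsifying weight layers (rather than input graphs) concrete, the sketch below illustrates one-shot magnitude pruning of a GNN weight layer in PyTorch. This is a minimal illustration and not the authors' implementation (which uses Surrogate Lagrangian Relaxation); the `GCNLayer` class, the `magnitude_prune_` helper, and the 90% sparsity target are hypothetical placeholders chosen for the example.

```python
# Illustrative sketch only: one-shot magnitude pruning of a GNN weight layer.
# NOT the paper's SLR-based method; layer sizes and sparsity are assumptions.
import torch
import torch.nn as nn


class GCNLayer(nn.Module):
    """Minimal GCN-style layer: H' = A_hat @ H @ W (bias omitted)."""

    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(in_dim, out_dim) * 0.1)

    def forward(self, a_hat: torch.Tensor, h: torch.Tensor) -> torch.Tensor:
        return a_hat @ h @ self.weight


def magnitude_prune_(layer: GCNLayer, sparsity: float) -> torch.Tensor:
    """Zero out the smallest-magnitude weights in place; return the binary mask."""
    with torch.no_grad():
        flat = layer.weight.abs().flatten()
        k = int(sparsity * flat.numel())               # number of weights to remove
        threshold = flat.kthvalue(k).values if k > 0 else flat.min() - 1
        mask = (layer.weight.abs() > threshold).float()
        layer.weight.mul_(mask)                        # "train and prune": prune after training
    return mask


if __name__ == "__main__":
    layer = GCNLayer(in_dim=16, out_dim=8)
    a_hat = torch.eye(4)                               # toy normalized adjacency (4 nodes)
    h = torch.randn(4, 16)                             # toy node features
    mask = magnitude_prune_(layer, sparsity=0.9)       # keep roughly 10% of weights
    out = layer(a_hat, h)
    print(f"kept {int(mask.sum())}/{mask.numel()} weights, output shape {tuple(out.shape)}")
```

A sparse-training variant would instead fix (or periodically update) such a mask from the start of training and apply it in every forward and backward pass, which is what keeps the training FLOPs low relative to training a dense model and pruning afterwards.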
Pages: 272-279
Number of pages: 8