A Pattern Based Algorithmic Autotuner for Graph Processing on GPUs

被引：26

作者：

Meng, Ke ^{[1
,2
]}

Li, Jiajia ^{[3
]}

Tan, Guangming ^{[1
,2
]}

Sun, Ninghui ^{[1
,2
]}

机构：

[1] Chinese Acad Sci, State Key Lab Comp Architecture, Inst Comp Technol, Beijing, Peoples R China

[2] Univ Chinese Acad Sci, Beijing, Peoples R China

[3] Pacific Northwest Natl Lab, Richland, WA 99352 USA

来源：

PROCEEDINGS OF THE 24TH SYMPOSIUM ON PRINCIPLES AND PRACTICE OF PARALLEL PROGRAMMING (PPOPP '19) | 2019年

基金：

中国国家自然科学基金;

关键词：

GPU; Graph processing; Auto-tuning; IMPLEMENTATION; DRIVEN; MODEL;

D O I：

10.1145/3293883.3295716

中图分类号：

TP31 [计算机软件];

学科分类号：

081202 ; 0835 ;

摘要：

This paper proposes GSWITCH, a pattern-based algorithmic auto-tuning system that dynamically switches between optimization variants with negligible overhead. Its novelty lies in a small set of algorithmic patterns that allow for the configurable assembly of variants of the algorithm. The fast transition of GSWITCH is based on a machine learning model trained using 644 real graphs. Moreover, GSWITCH provides a simple programming interface that conceals low-level tuning details from the user. We evaluate GSWITCH on typical graph algorithms (BFS, CC, PR, SSSP, and BC) using Nvidia Kepler and Pascal GPUs. The results show that GSWITCH runs up to 10x faster than the best configuration of the state-ofthe-art programmable GPU-based graph processing libraries on 10 representative graphs. GSWITCH outperforms Gunrock on 92.4% cases of 644 graphs which is the largest dataset evaluation reported to date.

引用

页码：201 / 213

页数：13

共 64 条

[1] Big Graph Analytics Platforms
不详
[J]. FOUNDATIONS AND TRENDS IN DATABASES, 2015, 7 (1-2): : 2 - +
[2] [Anonymous], 2018, COMP OTHER ENGINES
[3] [Anonymous], 2017, NETWORK REPOSITORY
[4] PetaBricks: A Language and Compiler for Algorithmic Choice
Ansel, Jason
Chan, Cy
Wong, Yee Lok
Olszewski, Marek
Zhao, Qin
Edelman, Alan
Amarasinghe, Saman
[J]. PLDI'09 PROCEEDINGS OF THE 2009 ACM SIGPLAN CONFERENCE ON PROGRAMMING LANGUAGE DESIGN AND IMPLEMENTATION, 2009, : 38 - 49
[5] Ashari Arash, 2015, HIGH PERF COMP NETW, P781
[6] Beamer S., 2012, 2012 SC - International Conference for High Performance Computing, Networking, Storage and Analysis, DOI 10.1109/SC.2012.50
[7] Ben-Nun T, 2017, ACM SIGPLAN NOTICES, V52, P235, DOI [10.1145/3018743.3018756, 10.1145/3155284.3018756]
[8] To Push or To Pull: On Reducing Communication and Synchronization in Graph Computations
Besta, Maciej
Podstawski, Michal
Groner, Linus
Solomonik, Edgar
Hoefler, Torsten
[J]. HPDC'17: PROCEEDINGS OF THE 26TH INTERNATIONAL SYMPOSIUM ON HIGH-PERFORMANCE PARALLEL AND DISTRIBUTED COMPUTING, 2017, : 93 - 104
[9] A faster algorithm for betweenness centrality
Brandes, U
[J]. JOURNAL OF MATHEMATICAL SOCIOLOGY, 2001, 25 (02) : 163 - 177
[10] The anatomy of a large-scale hypertextual Web search engine
Brin, S
Page, L
[J]. COMPUTER NETWORKS AND ISDN SYSTEMS, 1998, 30 (1-7): : 107 - 117

← 1 2 3 4 5 6 7 →