A Pattern Based Algorithmic Autotuner for Graph Processing on GPUs

被引:26
作者
Meng, Ke [1 ,2 ]
Li, Jiajia [3 ]
Tan, Guangming [1 ,2 ]
Sun, Ninghui [1 ,2 ]
机构
[1] Chinese Acad Sci, State Key Lab Comp Architecture, Inst Comp Technol, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Beijing, Peoples R China
[3] Pacific Northwest Natl Lab, Richland, WA 99352 USA
来源
PROCEEDINGS OF THE 24TH SYMPOSIUM ON PRINCIPLES AND PRACTICE OF PARALLEL PROGRAMMING (PPOPP '19) | 2019年
基金
中国国家自然科学基金;
关键词
GPU; Graph processing; Auto-tuning; IMPLEMENTATION; DRIVEN; MODEL;
D O I
10.1145/3293883.3295716
中图分类号
TP31 [计算机软件];
学科分类号
081202 ; 0835 ;
摘要
This paper proposes GSWITCH, a pattern-based algorithmic auto-tuning system that dynamically switches between optimization variants with negligible overhead. Its novelty lies in a small set of algorithmic patterns that allow for the configurable assembly of variants of the algorithm. The fast transition of GSWITCH is based on a machine learning model trained using 644 real graphs. Moreover, GSWITCH provides a simple programming interface that conceals low-level tuning details from the user. We evaluate GSWITCH on typical graph algorithms (BFS, CC, PR, SSSP, and BC) using Nvidia Kepler and Pascal GPUs. The results show that GSWITCH runs up to 10x faster than the best configuration of the state-ofthe-art programmable GPU-based graph processing libraries on 10 representative graphs. GSWITCH outperforms Gunrock on 92.4% cases of 644 graphs which is the largest dataset evaluation reported to date.
引用
收藏
页码:201 / 213
页数:13
相关论文
共 64 条
  • [1] Big Graph Analytics Platforms
    不详
    [J]. FOUNDATIONS AND TRENDS IN DATABASES, 2015, 7 (1-2): : 2 - +
  • [2] [Anonymous], 2018, COMP OTHER ENGINES
  • [3] [Anonymous], 2017, NETWORK REPOSITORY
  • [4] PetaBricks: A Language and Compiler for Algorithmic Choice
    Ansel, Jason
    Chan, Cy
    Wong, Yee Lok
    Olszewski, Marek
    Zhao, Qin
    Edelman, Alan
    Amarasinghe, Saman
    [J]. PLDI'09 PROCEEDINGS OF THE 2009 ACM SIGPLAN CONFERENCE ON PROGRAMMING LANGUAGE DESIGN AND IMPLEMENTATION, 2009, : 38 - 49
  • [5] Ashari Arash, 2015, HIGH PERF COMP NETW, P781
  • [6] Beamer S., 2012, 2012 SC - International Conference for High Performance Computing, Networking, Storage and Analysis, DOI 10.1109/SC.2012.50
  • [7] Ben-Nun T, 2017, ACM SIGPLAN NOTICES, V52, P235, DOI [10.1145/3018743.3018756, 10.1145/3155284.3018756]
  • [8] To Push or To Pull: On Reducing Communication and Synchronization in Graph Computations
    Besta, Maciej
    Podstawski, Michal
    Groner, Linus
    Solomonik, Edgar
    Hoefler, Torsten
    [J]. HPDC'17: PROCEEDINGS OF THE 26TH INTERNATIONAL SYMPOSIUM ON HIGH-PERFORMANCE PARALLEL AND DISTRIBUTED COMPUTING, 2017, : 93 - 104
  • [9] A faster algorithm for betweenness centrality
    Brandes, U
    [J]. JOURNAL OF MATHEMATICAL SOCIOLOGY, 2001, 25 (02) : 163 - 177
  • [10] The anatomy of a large-scale hypertextual Web search engine
    Brin, S
    Page, L
    [J]. COMPUTER NETWORKS AND ISDN SYSTEMS, 1998, 30 (1-7): : 107 - 117