NDMiner: Accelerating Graph Pattern Mining Using Near Data Processing

被引：8

作者：

Talati, Nishil ^{[1
]}

Ye, Haojie ^{[1
]}

Yang, Yichen ^{[1
]}

Belayneh, Leul ^{[1
]}

Chen, Kuan-Yu ^{[1
]}

Blaauw, David ^{[1
]}

Mudge, Trevor ^{[1
]}

Dreslinski, Ronald ^{[1
]}

机构：

[1] Univ Michigan, Ann Arbor, MI 48109 USA

来源：

PROCEEDINGS OF THE 2022 THE 49TH ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE (ISCA '22) | 2022年

关键词：

Graph pattern mining; near data processing; hardware-software co-design; RAM;

D O I：

10.1145/3470496.3527437

中图分类号：

TP3 [计算技术、计算机技术];

学科分类号：

0812 ;

摘要：

Graph Pattern Mining (GPM) algorithms mine structural patterns in graphs. The performance of GPM workloads is bottlenecked by control flow and memory stalls. This is because of data-dependent branches used in set intersection and difference operations that dominate the execution time. This paper first conducts a systematic GPM workload analysis and uncovers four new observations to inform the optimization effort. First, GPM workloads mostly fetch inputs of costly set operations from different memory banks. Second, to avoid redundant computation, modern GPM workloads employ symmetry breaking that discards several data reads, resulting in cache pollution and wasted DRAM bandwidth. Third, sparse pattern mining algorithms perform redundant memory reads and computations. Fourth, GPM workloads do not fully utilize the in-DRAM data parallelism. Based on these observations, this paper presents NDMiner, a Near Data Processing (NDP) architecture that improves the performance of GPM workloads. To reduce in-memory data transfer of fetching data from different memory banks, NDMiner integrates compute units to offload set operations in the buffer chip of DRAM. To alleviate the wasted memory bandwidth caused by symmetry breaking, NDMiner integrates a load elision unit in hardware that detects the satisfiability of symmetry breaking constraints and terminates unnecessary loads. To optimize the performance of sparse pattern mining, NDMiner employs compiler optimizations and maps reduced reads and composite computation to NDP hardware that improves algorithmic efficiency of sparse GPM. Finally, NDMiner proposes a new graph remapping scheme in memory and a hardware-based set operation reordering technique to best optimize bank, rank, and channel-level parallelism in DRAM. To orchestrate NDP computation, this paper presents design modifications at the host ISA, compiler, and memory controller. We compare the performance of NDMiner with state-of-the-art software and hardware baselines using a mix of dense and sparse GPM algorithms. Our evaluation shows that NDMiner significantly outperforms software and hardware baselines by 6.4x and 2.5x, on average, while incurring a negligible area overhead on CPU and DRAM.

引用

页码：146 / 159

页数：14

共 50 条

[1] NDSEARCH: Accelerating Graph-Traversal-Based Approximate Nearest Neighbor Search through Near Data Processing
Wang, Yitu
Li, Shiyu
Zheng, Qilin
Song, Linghao
Li, Zongwang
Chang, Andrew
Li, Hai ''Helen''
Chen, Yiran
2024 ACM/IEEE 51ST ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE, ISCA 2024, 2024, : 368 - 381
[2] Survival Prediction from Longitudinal Health Insurance Data using Graph Pattern Mining
Ren, Yongjian
Zhang, Kun
Shi, Yuliang
2019 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM), 2019, : 1104 - 1108
[3] Efficient Large Graph Pattern Mining for Big Data in the Cloud
Chen, Chun-Chieh
Lee, Kuan-Wei
Chang, Chih-Chieh
Yang, De-Nian
Chen, Ming-Syan
2013 IEEE INTERNATIONAL CONFERENCE ON BIG DATA, 2013,
[4] Casper: Accelerating Stencil Computations Using Near-Cache Processing
Denzler, Alain
Oliveira, Geraldo F.
Hajinazar, Nastaran
Bera, Rahul
Singh, Gagandeep
Gomez-Luna, Juan
Mutlu, Onur
IEEE ACCESS, 2023, 11 : 22136 - 22154
[5] Accelerating Linked-list Traversal Through Near-Data Processing
Hong, Byungchul
Kim, Gwangsun
Ahn, Jung Ho
Kwon, Yongkee
Kim, Hongsik
Kim, John
2016 INTERNATIONAL CONFERENCE ON PARALLEL ARCHITECTURE AND COMPILATION TECHNIQUES (PACT), 2016, : 113 - 124
[6] FlexMiner: A Pattern-Aware Accelerator for Graph Pattern Mining
Chen, Xuhao
Huang, Tianhao
Xu, Shuotao
Bourgeat, Thomas
Chung, Chanwoo
Arvind
2021 ACM/IEEE 48TH ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE (ISCA 2021), 2021, : 581 - 594
[7] GraNDe: Efficient Near-Data Processing Architecture for Graph Neural Networks
Yun, Sungmin
Nam, Hwayong
Park, Jaehyun
Kim, Byeongho
Ahn, Jung Ho
Lee, Eojin
IEEE TRANSACTIONS ON COMPUTERS, 2024, 73 (10) : 2391 - 2404
[8] Efficient Strategies for Graph Pattern Mining Algorithms on GPUs
Ferraz, Samuel
Dias, Vinicius
Teixeira, Carlos H. C.
Teodoro, George
Meira Jr, Wagner
2022 IEEE 34TH INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE AND HIGH PERFORMANCE COMPUTING (SBAC-PAD 2022), 2022, : 110 - 119
[9] A GPU-based Graph Pattern Mining System
Hu, Lin
Zou, Lei
PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, CIKM 2022, 2022, : 4867 - 4871
[10] Graph Pattern Mining Paradigms: Consolidation and Renewed Bearing
Dias, Vinicius
Ferraz, Samuel
Vadlamani, Aditya
Erfanian, Mahdi
Teixeira, Carlos H. C.
Guedes, Dorgival
Meira, Wagner, Jr.
Parthasarathy, Srinivasan
2023 IEEE 30TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING, DATA, AND ANALYTICS, HIPC 2023, 2023, : 224 - 233

← 1 2 3 4 5 →