Polyhedral-Based Compilation Framework for In-Memory Neural Network Accelerators

Times Cited: 4
Authors
Han, Jianhui [1 ]
Fei, Xiang [2 ]
Li, Zhaolin [2 ]
Zhang, Youhui [2 ]
Affiliations
[1] Tsinghua Univ, Beijing Natl Res Ctr Informat Sci & Technol, Sch Integrated Circuits, 30 Shuangqing Rd, Beijing 100084, Peoples R China
[2] Tsinghua Univ, Beijing Natl Res Ctr Informat Sci & Technol, Dept Comp Sci & Technol, 30 Shuangqing Rd, Beijing 100084, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Polyhedral model; memristor; processing-in-memory; HARDWARE; ENERGY;
DOI
10.1145/3469847
Chinese Library Classification (CLC) Number
TP3 [Computing Technology, Computer Technology];
Discipline Code
0812;
Abstract
Memristor-based processing-in-memory architecture is a promising solution to the memory bottleneck in neural network (NN) processing. A major challenge for the programmability of such architectures is the automatic compilation of high-level NN workloads, composed of various operators, onto memristor-based hardware that may expose programming interfaces of different granularities. This article proposes a source-to-source compilation framework for such memristor-based NN accelerators, which automatically detects and maps multiple NN operators by building on the flexible and expressive representation of the polyhedral model. In contrast to previous studies, it supports pipeline generation, exploiting the parallelism in NN workloads to use hardware resources more efficiently. Evaluation on synthetic kernels and NN benchmarks demonstrates that the proposed framework reliably detects and maps the target operators. Case studies on typical memristor-based architectures also show its generality across architectural designs. The evaluation further demonstrates that, compared with existing polyhedral-based compilation frameworks without pipelined execution support, pipelined execution improves performance by an order of magnitude, underscoring the necessity of this improvement.
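To illustrate the kind of analysis the abstract describes, the sketch below shows how a polyhedral-style compiler can recognize a crossbar-mappable operator from a statement's affine access functions. This is a hypothetical, simplified illustration, not the paper's actual implementation: the statement encoding, the helper names (`access_matrix`, `is_matvec`), and the pattern test are all assumptions made for this example.

```python
# Hypothetical sketch: detecting a matrix-vector product in a loop nest
# from its affine access functions, the kind of pattern a polyhedral
# compiler could map onto a memristor crossbar (matrix stationary in the
# crossbar, vector streamed in, results accumulated).

def access_matrix(expr_indices, loop_vars):
    """Affine access function as a 0/1 matrix: one row per array
    dimension, one column per loop variable."""
    return [[1 if v == idx else 0 for v in loop_vars] for idx in expr_indices]

def is_matvec(stmt):
    """A statement y[i] += W[i][j] * x[j] over loops (i, j) is
    crossbar-mappable: the write depends only on the outer loop, the
    matrix read on both loops, and the vector read only on the inner loop."""
    loops = stmt["loops"]
    W = access_matrix(stmt["reads"]["W"], loops)
    x = access_matrix(stmt["reads"]["x"], loops)
    y = access_matrix(stmt["writes"]["y"], loops)
    return W == [[1, 0], [0, 1]] and x == [[0, 1]] and y == [[1, 0]]

# Encoding of the statement body: y[i] += W[i][j] * x[j]
stmt = {
    "loops": ("i", "j"),
    "writes": {"y": ("i",)},
    "reads": {"W": ("i", "j"), "x": ("j",)},
}
print(is_matvec(stmt))  # True: W goes to the crossbar, x is streamed
```

In a real polyhedral framework the access functions would come from the compiler's affine analysis (e.g., isl relations) rather than hand-written tuples, but the matching idea is the same: operator detection reduces to checking the shape of affine maps.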
Pages: 23