Dedicated Instruction Set for Pattern-Based Data Transfers: An Experimental Validation on Systems Containing In-Memory Computing Units

Citations: 0
Authors
Mambu, Kevin [1]
Charles, Henri-Pierre [1]
Kooli, Maha [1]
Affiliations
[1] Univ Grenoble Alpes, CEA, LIST, F-38000 Grenoble, France
Keywords
Computer architecture; Instruction sets; Random access memory; Programming; Computational modeling; Central Processing Unit; Data transfer; Convolution; in-memory computing (IMC); instruction set architecture (ISA); non-von Neumann; pattern; performance analysis; programming model; stencil; IMAGE; SRAM;
DOI
10.1109/TCAD.2023.3258346
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
In-memory computing (IMC) aims to close the performance gap between CPUs and memories caused by the memory wall. However, general-purpose IMC does not optimize data transfers for patterns such as stencils and convolutions. This article proposes a new instruction set architecture (ISA) and a novel pattern encoding for IMC to transfer and organize data streams so that computation can be performed efficiently. This instruction set is implemented on the data-locality management unit (DMU) as a subset of the computational SRAM (C-SRAM) ISA. A programming model for interacting with the DMU at the language level is also presented. The DMU ISA is evaluated on six applications run on three different system nodes. These system nodes are based on existing RISC-V cores and range from the embedded to the high-performance computing domain. On embedded systems, experiments show on average a speed-up of 8.81x, an energy reduction factor of 6.81x, and a 4.59x improvement in the number of operations per cycle for the C-SRAM architecture integrating the proposed DMU ISA, compared to a reference implementation. Across all system nodes, results also show a 2.99x improvement in the number of operations per cycle compared to a reference implementation.
Pages: 3757-3767
Number of pages: 11