A Hardware Instruction Generation Mechanism for Energy-Efficient Computational Memories

Citations: 0
Authors
De La Fuente, Leo [1]
Christmann, Jean-Frederic [1]
Pezzin, Manuel [1]
Remars, Matthias [1]
Sentieys, Olivier [2]
Affiliations
[1] Univ Grenoble Alpes, CEA, List, F-38000 Grenoble, France
[2] Univ Rennes, Inria, Rennes, France
Source
2024 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, ISCAS 2024 | 2024
Keywords
near-memory computing; macro-instruction; matrix multiplication; GeMM; embedded systems;
DOI
10.1109/ISCAS58744.2024.10557870
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
In the Computing-In-Memory (CIM) approach, computations are performed directly within the data storage unit, which often reduces energy consumption. This makes it particularly well suited to embedded systems, which are tightly constrained in energy. It is commonly accepted that this energy reduction comes from fewer data transfers between the CPU and the main memory. Nevertheless, preparing and sending instructions to the computational memory also consumes energy and time, which limits overall performance. In this paper, we present a hardware instruction generation mechanism integrated in computational memories and evaluate its benefit for Integer General Matrix Multiplication (IGeMM) operations. The proposed mechanism is implemented in the computational memory controller and translates macro-instructions into the corresponding micro-instructions needed to execute the kernel on stored data. We modified an existing near-memory computing architecture and extracted the corresponding energy consumption figures using post-layout simulations of the complete SoC. Our proposed architecture, NEar memory computing Macro-Instruction Kernel Accelerator (NeMIKA), provides an 8.2x speed-up and a 4.6x reduction in energy consumption compared to a state-of-the-art CIM accelerator based on micro-instructions, while incurring an area overhead of only 0.1%.
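The macro-to-micro translation described in the abstract can be pictured with a small software model. The sketch below is only an illustration under stated assumptions: the record does not give NeMIKA's instruction formats, so `MacroIGeMM`, `MicroOp`, `expand`, and `run` are hypothetical names, and the micro-instruction is assumed to be a simple multiply-accumulate on word addresses. It shows how a single macro-instruction sent by the CPU could stand in for the m x n x k micro-instructions that would otherwise have to be prepared and sent one by one.

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass(frozen=True)
class MacroIGeMM:
    """Hypothetical macro-instruction: C[m x n] = A[m x k] * B[k x n]."""
    m: int
    n: int
    k: int
    a_base: int  # word address of A, stored row-major (assumed layout)
    b_base: int  # word address of B, stored row-major (assumed layout)
    c_base: int  # word address of C, stored row-major, pre-zeroed


@dataclass(frozen=True)
class MicroOp:
    """Hypothetical micro-instruction: mem[dst] += mem[src_a] * mem[src_b]."""
    dst: int
    src_a: int
    src_b: int


def expand(macro: MacroIGeMM) -> Iterator[MicroOp]:
    """Generate the micro-instructions for one macro-instruction.

    In the paper this step is done in hardware inside the memory
    controller; here it is a plain Python generator.
    """
    for i in range(macro.m):
        for j in range(macro.n):
            dst = macro.c_base + i * macro.n + j
            for p in range(macro.k):
                src_a = macro.a_base + i * macro.k + p
                src_b = macro.b_base + p * macro.n + j
                yield MicroOp(dst, src_a, src_b)


def run(mem: List[int], macro: MacroIGeMM) -> int:
    """Execute the expanded micro-ops on `mem`; return how many were issued."""
    issued = 0
    for op in expand(macro):
        mem[op.dst] += mem[op.src_a] * mem[op.src_b]
        issued += 1
    return issued


# A = [[1, 2], [3, 4]], B = [[5, 6], [7, 8]], C starts at zero.
mem = [1, 2, 3, 4, 5, 6, 7, 8, 0, 0, 0, 0]
issued = run(mem, MacroIGeMM(m=2, n=2, k=2, a_base=0, b_base=4, c_base=8))
result = mem[8:12]
```

In this toy model the CPU would send one macro-instruction while the controller issues `m * n * k` micro-operations locally, which is the instruction-traffic saving the abstract refers to. The model says nothing about the actual timing or energy figures reported in the paper.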
Pages: 5
Related papers
17 in total
[1]   Brain-Inspired Hyperdimensional Computing for Ultra-Efficient Edge AI [J].
Amrouch, Hussam ;
Imani, Mohsen ;
Jiao, Xun ;
Aloimonos, Yiannis ;
Fermuller, Cornelia ;
Yuan, Dehao ;
Ma, Dongning ;
Barkam, Hamza E. ;
Genssler, Paul R. ;
Sutor, Peter .
2022 INTERNATIONAL CONFERENCE ON HARDWARE/SOFTWARE CODESIGN AND SYSTEM SYNTHESIS (CODES+ISSS), 2022, :25-34
[2]   CoNDA: Efficient Cache Coherence Support for Near-Data Accelerators [J].
Boroumand, Amirali ;
Ghose, Saugata ;
Patel, Minesh ;
Hassan, Hasan ;
Lucia, Brandon ;
Ausavarungnirun, Rachata ;
Hsieh, Kevin ;
Hajinazar, Nastaran ;
Malladi, Krishna T. ;
Zheng, Hongzhong ;
Mutlu, Onur .
PROCEEDINGS OF THE 2019 46TH INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE (ISCA '19), 2019, :629-642
[3]   Hardware-software co-exploration with racetrack memory based in-memory computing for CNN inference in embedded systems [J].
Choong, Benjamin Chen Ming ;
Luo, Tao ;
Liu, Cheng ;
He, Bingsheng ;
Zhang, Wei ;
Zhou, Joey Tianyi .
JOURNAL OF SYSTEMS ARCHITECTURE, 2022, 128
[4]  
Dobberpuhl D. W., 1986, j-DEC-TECH-J, V1, P12
[5]  
Draper J., 2002, Conference Proceedings of the 2002 International Conference on SUPERCOMPUTING, P14, DOI 10.1145/514191.514197
[6]  
Eggermann G. A., 2023, A 16-bit Floating-Point Near-SRAM Architecture for Low-power Sparse Matrix-Vector Multiplication, P6
[7]   Near-Threshold RISC-V Core With DSP Extensions for Scalable IoT Endpoint Devices [J].
Gautschi, Michael ;
Schiavone, Pasquale Davide ;
Traber, Andreas ;
Loi, Igor ;
Pullini, Antonio ;
Rossi, Davide ;
Flamand, Eric ;
Gurkaynak, Frank K. ;
Benini, Luca .
IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, 2017, 25 (10) :2700-2713
[8]   A Survey on Memory-centric Computer Architectures [J].
Gebregiorgis, Anteneh ;
Hoang Anh Du Nguyen ;
Yu, Jintao ;
Bishnoi, Rajendra ;
Taouil, Mottaqiallah ;
Catthoor, Francky ;
Hamdioui, Said .
ACM JOURNAL ON EMERGING TECHNOLOGIES IN COMPUTING SYSTEMS, 2022, 18 (04)
[9]  
Hollingsworth W., 1987, Tech. Rep.
[10]   MagCiM: A Flexible and Non-Volatile Computing-in-Memory Processor for Energy-Efficient Logic Computation [J].
Jamshidi, Vahid ;
Patooghy, Ahmad ;
Fazeli, Mahdi .
IEEE ACCESS, 2022, 10 :35445-35459