A High-Performance, Energy-Efficient Modular DMA Engine Architecture

Cited by: 4
Authors
Benz, Thomas [1]
Rogenmoser, Michael [1]
Scheffler, Paul [1]
Riedel, Samuel [1]
Ottaviano, Alessandro [1]
Kurth, Andreas [1]
Hoefler, Torsten [2]
Benini, Luca [3, 4]
Affiliations
[1] Swiss Fed Inst Technol, Integrated Syst Lab IIS, CH-8092 Zurich, Switzerland
[2] Swiss Fed Inst Technol, Scalable Parallel Comp Lab SPCL, CH-8092 Zurich, Switzerland
[3] Swiss Fed Inst Technol, Integrated Syst Lab IIS, Zurich, Switzerland
[4] Univ Bologna, Dept Elect Elect & Informat Engn DEI, I-40126 Bologna, Italy
Keywords
DMA; DMAC; direct memory access; memory systems; high-performance; energy-efficiency; edge AI; AXI; TileLink
DOI
10.1109/TC.2023.3329930
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Data transfers are essential in today's computing systems as latency and complex memory access patterns are increasingly challenging to manage. Direct memory access engines (DMAEs) are critically needed to transfer data independently of the processing elements, hiding latency and achieving high throughput even for complex access patterns to high-latency memory. With the prevalence of heterogeneous systems, DMAEs must operate efficiently in increasingly diverse environments. This work proposes a modular and highly configurable open-source DMAE architecture called intelligent DMA (iDMA), split into three parts that can be composed and customized independently. The front-end implements the control plane binding to the surrounding system. The mid-end accelerates complex data transfer patterns such as multi-dimensional transfers, scattering, or gathering. The back-end interfaces with the on-chip communication fabric (data plane). We assess the efficiency of iDMA in various instantiations: in high-performance systems, we achieve speedups of up to 15.8× with only 1% additional area compared to a base system without a DMAE. In ultra-low-energy edge AI systems, we achieve an area reduction of 10% while improving ML inference performance by 23% over an existing DMAE solution. We provide area, timing, latency, and performance characterization to guide its instantiation in various systems.
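As a rough illustration of the mid-end/back-end split described in the abstract, the following Python sketch models how a two-dimensional strided transfer could be expanded into contiguous one-dimensional transfers that a back-end then executes. It is a toy software model under stated assumptions, not the iDMA hardware interface: the names `Transfer1D`, `expand_2d`, and `run_backend`, and the stride/repetition parameters, are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class Transfer1D:
    """One contiguous copy of `length` bytes from `src` to `dst` (hypothetical type)."""
    src: int
    dst: int
    length: int


def expand_2d(src: int, dst: int, length: int,
              src_stride: int, dst_stride: int, reps: int) -> Iterator[Transfer1D]:
    """Assumed mid-end behaviour: split a 2D strided transfer into 1D transfers.

    Each of the `reps` rows is `length` bytes long; consecutive rows start
    `src_stride` bytes apart in the source and `dst_stride` bytes apart in
    the destination.
    """
    for i in range(reps):
        yield Transfer1D(src + i * src_stride, dst + i * dst_stride, length)


def run_backend(memory: bytearray, transfers: Iterable[Transfer1D]) -> None:
    """Toy back-end: perform each 1D copy on a flat byte array standing in for memory."""
    for t in transfers:
        memory[t.dst:t.dst + t.length] = memory[t.src:t.src + t.length]


# Usage: gather three 4-byte rows that sit 8 bytes apart into a packed buffer at offset 32.
memory = bytearray(range(64))
run_backend(memory, expand_2d(src=0, dst=32, length=4,
                              src_stride=8, dst_stride=4, reps=3))
print(list(memory[32:44]))
```

In this model the caller issues a single 2D descriptor and the expansion step generates the individual contiguous copies, which mirrors the idea that complex patterns are handled separately from the plain data movement.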
Pages: 263-277
Page count: 15
Related papers
50 in total
  • [41] Hybrid Convolution Architecture for Energy-Efficient Deep Neural Network Processing
    Kim, Suchang
    Jo, Jihyuck
    Park, In-Cheol
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS, 2021, 68 (05) : 2017 - 2029
  • [42] Evaluation of energy-efficient design strategies: Comparison of the thermal performance of energy-efficient office buildings in composite climate, India
    Bano, Farheen
    Sehgal, Vandana
    SOLAR ENERGY, 2018, 176 : 506 - 519
  • [43] Energy-Efficient Partitioning of Hybrid Caches in Multi-Core Architecture
    Lee, Dongwoo
    Choi, Kiyoung
    2014 22ND INTERNATIONAL CONFERENCE ON VERY LARGE SCALE INTEGRATION (VLSI-SOC), 2014,
  • [44] Hybrid Scratchpad Video Memory Architecture for Energy-Efficient Parallel HEVC
    Sampaio, Felipe M.
    Zatt, Bruno
    Shafique, Muhammad
    Henkel, Jorg
    Bampi, Sergio
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2019, 29 (10) : 3046 - 3060
  • [45] Energy-Efficient Multiple Network-on-Chip Architecture With Bandwidth Expansion
    Zhou, Wu
    Ouyang, Yiming
    Xu, Dongyu
    Huang, Zhengfeng
    Liang, Huaguo
    Wen, Xiaoqing
    IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, 2023, 31 (04) : 442 - 455
  • [46] Energy-efficient computing-in-memory architecture for AI processor: device, circuit, architecture perspective
    Liang CHANG
    Chenglong LI
    Zhaomin ZHANG
    Jianbiao XIAO
    Qingsong LIU
    Zhen ZHU
    Weihang LI
    Zixuan ZHU
    Siqi YANG
    Jun ZHOU
    Science China (Information Sciences), 2021, 64 (06) : 45 - 59
  • [47] Energy-efficient computing-in-memory architecture for AI processor: device, circuit, architecture perspective
    Liang Chang
    Chenglong Li
    Zhaomin Zhang
    Jianbiao Xiao
    Qingsong Liu
    Zhen Zhu
    Weihang Li
    Zixuan Zhu
    Siqi Yang
    Jun Zhou
    Science China Information Sciences, 2021, 64
  • [48] ELSA: Energy-Efficient Linear Sensor Architecture for Smart City Applications
    Almalki, Khalid J.
    Jabbari, Abdoh
    Ayinala, Kaushik
    Sung, Sanghak
    Choi, Baek-Young
    Song, Sejun
    IEEE SENSORS JOURNAL, 2022, 22 (07) : 7074 - 7083
  • [49] Energy-efficient computing-in-memory architecture for AI processor: device, circuit, architecture perspective
    Chang, Liang
    Li, Chenglong
    Zhang, Zhaomin
    Xiao, Jianbiao
    Liu, Qingsong
    Zhu, Zhen
    Li, Weihang
    Zhu, Zixuan
    Yang, Siqi
    Zhou, Jun
    SCIENCE CHINA-INFORMATION SCIENCES, 2021, 64 (06)
  • [50] Survey of Novel Architectures for Energy Efficient High-Performance Mobile Computing Platforms
    O'Connor, Owen
    Elfouly, Tarek
    Alouani, Ali
    ENERGIES, 2023, 16 (16)