iPIM: Programmable In-Memory Image Processing Accelerator Using Near-Bank Architecture

Times cited: 44
Authors
Gu, Peng [1]
Xie, Xinfeng [1]
Ding, Yufei [2]
Chen, Guoyang [3]
Zhang, Weifeng [3]
Niu, Dimin [4]
Xie, Yuan [1, 4]
Affiliations
[1] UCSB, Dept Elect & Comp Engn, Santa Barbara, CA 93106 USA
[2] UCSB, Dept Comp Sci, Santa Barbara, CA USA
[3] Alibaba Cloud Infrastruct, Sunnyvale, CA USA
[4] Alibaba DAMO Acad, Sunnyvale, CA USA
Source
2020 ACM/IEEE 47TH ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE (ISCA 2020) | 2020
Funding
U.S. National Science Foundation (NSF)
Keywords
Process-in-memory; Image Processing; Accelerator; Language; Compiler
DOI
10.1109/ISCA45697.2020.00071
Chinese Library Classification (CLC) number
TP3 [Computing Technology, Computer Technology]
Discipline classification code
0812
Abstract
Image processing is becoming an increasingly important domain for many applications on workstations and the datacenter that require accelerators for high performance and energy efficiency. GPU, which is the state-of-the-art accelerator for image processing, suffers from the memory bandwidth bottleneck. To tackle this bottleneck, near-bank architecture provides a promising solution due to its enormous bank-internal bandwidth and low-energy memory access. However, previous work lacks hardware programmability, while image processing workloads contain numerous heterogeneous pipeline stages with diverse computation and memory access patterns. Enabling programmable near-bank architecture with low hardware overhead remains challenging. This work proposes iPIM, the first programmable in-memory image processing accelerator using near-bank architecture. We first design a decoupled control-execution architecture to provide lightweight programmability support. Second, we propose the SIMB (Single-Instruction-Multiple-Bank) ISA to enable flexible control flow and data access. Third, we present an end-to-end compilation flow based on Halide that supports a wide range of image processing applications and maps them to our SIMB ISA. We further develop iPIM-aware compiler optimizations, including register allocation, instruction reordering, and memory order enforcement to improve performance. We evaluate a set of representative image processing applications on iPIM and demonstrate that on average iPIM obtains 11.02x acceleration and 79.49% energy saving over an NVIDIA Tesla V100 GPU. Further analysis shows that our compiler optimizations contribute 3.19x speedup over the unoptimized baseline.
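The two headline figures in the abstract use different units (a speedup factor and a percentage saving). As a rough aid for comparing them, the sketch below converts the reported 79.49% energy saving into the equivalent energy ratio and efficiency factor. The function names are illustrative and not from the paper, and the conversion assumes the saving is measured against the same V100 baseline as the speedup.

```python
# Illustrative arithmetic only; the helper names are hypothetical.
# Inputs are the average figures reported in the abstract.

REPORTED_SPEEDUP = 11.02          # iPIM vs. NVIDIA Tesla V100 (average)
REPORTED_ENERGY_SAVING = 79.49    # percent, vs. the same GPU (average)


def energy_ratio(saving_percent: float) -> float:
    """Fraction of the baseline energy still consumed after the saving."""
    return 1.0 - saving_percent / 100.0


def efficiency_gain(saving_percent: float) -> float:
    """Baseline energy divided by the reduced energy."""
    return 1.0 / energy_ratio(saving_percent)


remaining = energy_ratio(REPORTED_ENERGY_SAVING)
gain = efficiency_gain(REPORTED_ENERGY_SAVING)

print(f"energy relative to GPU: {remaining:.4f}")   # 0.2051
print(f"energy-efficiency gain: {gain:.2f}x")       # 4.88x
```

Under this reading, a 79.49% saving corresponds to using about one fifth of the GPU's energy, which is a smaller factor than the 11.02x speedup.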
Pages: 804 - 817
Number of pages: 14
Related papers
36 in total
  • [1] RecPIM: Efficient In-Memory Processing for Personalized Recommendation Inference Using Near-Bank Architecture
    Yang, Weidong
    Yang, Yuqing
    Ji, Shuya
    Jiang, Jianfei
    Jing, Naifeng
    Wang, Qin
    Mao, Zhigang
    Sheng, Weiguang
    IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2024, 43 (10) : 2854 - 2867
  • [2] PIPArch: Programmable Image Processing Architecture Using Sliding Array
    Wu, Feiyang
    Song, Zhuoran
    Ke, Jing
    Jiang, Li
    Jing, Naifeng
    Liang, Xiaoyao
    19TH IEEE INTERNATIONAL SYMPOSIUM ON PARALLEL AND DISTRIBUTED PROCESSING WITH APPLICATIONS (ISPA/BDCLOUD/SOCIALCOM/SUSTAINCOM 2021), 2021 : 73 - 80
  • [3] PIMCA: A Programmable In-Memory Computing Accelerator for Energy-Efficient DNN Inference
    Zhang, Bo
    Yin, Shihui
    Kim, Minkyu
    Saikia, Jyotishman
    Kwon, Soonwan
    Myung, Sungmeen
    Kim, Hyunsoo
    Kim, Sang Joon
    Seo, Jae-Sun
    Seok, Mingoo
    IEEE JOURNAL OF SOLID-STATE CIRCUITS, 2023, 58 (05) : 1436 - 1449
  • [4] A Many-core Architecture for In-Memory Data Processing
    Agrawal, Sandeep R.
    Idicula, Sam
    Raghavan, Arun
    Vlachos, Evangelos
    Govindaraju, Venkatraman
    Varadarajan, Venkatanathan
    Balkesen, Cagri
    Giannikis, Georgios
    Roth, Charlie
    Agarwal, Nipun
    Sedlar, Eric
    50TH ANNUAL IEEE/ACM INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE (MICRO), 2017 : 245 - 258
  • [5] Efficient memory architecture for image processing
    Perri, Stefania
    Corsonello, Pasquale
    INTERNATIONAL JOURNAL OF CIRCUIT THEORY AND APPLICATIONS, 2011, 39 (03) : 351 - 356
  • [6] Approximate In-Memory Computing using Memristive IMPLY Logic and its Application to Image Processing
    Fatemieh, Seyed Erfan
    Reshadinezhad, Mohammad Reza
    TaheriNejad, Nima
    2022 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS 22), 2022 : 3115 - 3119
  • [7] SongC: A Compiler for Hybrid Near-Memory and In-Memory Many-Core Architecture
    Lin, Junfeng
    Qu, Huanyu
    Ma, Songchen
    Ji, Xinglong
    Li, Hongyi
    Li, Xiaochuan
    Song, Chenhang
    Zhang, Weihao
    IEEE TRANSACTIONS ON COMPUTERS, 2024, 73 (10) : 2420 - 2433
  • [8] DRAMA: An Architecture for Accelerated Processing Near Memory
    Farmahini-Farahani, Amin
    Ahn, Jung Ho
    Morrow, Katherine
    Kim, Nam Sung
    IEEE COMPUTER ARCHITECTURE LETTERS, 2015, 14 (01) : 26 - 29
  • [9] Bank on Compute-Near-Memory: Design Space Exploration of Processing-Near-Bank Architectures
    Medina, Rafael
    Ansaloni, Giovanni
    Zapater, Marina
    Levisse, Alexandre
    Chamazcoti, Saeideh Alinezhad
    Evenblij, Timon
    Biswas, Dwaipayan
    Catthoor, Francky
    Atienza, David
    IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2024, 43 (11) : 4117 - 4129
  • [10] A Portable Image Processing Accelerator using FPGA
    Tsiktsiris, Dimitris
    Ziouzios, Dimitris
    Dasygenis, Minas
    2018 7TH INTERNATIONAL CONFERENCE ON MODERN CIRCUITS AND SYSTEMS TECHNOLOGIES (MOCAST), 2018