Dyamond: Compact and Efficient 1T1C DRAM IMC Accelerator With Bit Column Addition for Memory-Intensive AI

Cited by: 0
Authors
Hong, Seongyon [1 ]
Jo, Wooyoung [1 ]
Kim, Sangjin [1 ]
Kim, Sangyeob [1 ]
Um, Soyeon [1 ]
Sohn, Kyomin [2 ]
Yoo, Hoi-Jun [1 ]
Affiliations
[1] Korea Adv Inst Sci & Technol KAIST, Sch Elect Engn, Daejeon 34141, South Korea
[2] Samsung Elect, FLASH & DRAM Design Team, Memory Div, Hwaseong 18448, South Korea
Keywords
Random access memory; Energy efficiency; Computer architecture; Artificial intelligence; Arrays; Single instruction multiple data; Computational efficiency; System-on-chip; In-memory computing; Accuracy; Artificial intelligence (AI); bit column addition (BCA) dataflow; compact MAC-SIMD (CMS) circuit; dynamic random access memory (DRAM); in-memory computing (IMC);
DOI
10.1109/JSSC.2025.3538899
Chinese Library Classification: TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Discipline Codes: 0808; 0809
Abstract
This article proposes Dyamond, a one-transistor, one-capacitor (1T1C) dynamic random access memory (DRAM) in-memory computing (IMC) accelerator with architecture-to-circuit-level optimizations for high memory density and energy efficiency. The bit column addition (BCA) dataflow introduces output bit-wise accumulation to exploit the varying accuracy and energy characteristics across different bit positions. The lower BCA (LBCA) reduces analog-to-digital converter (ADC) operations through inter-column analog accumulation, enhancing energy efficiency. The higher BCA (HBCA) improves accuracy through signal enhancement and minimizes energy consumption per ADC readout with signal shift (SS). The design maximizes memory density by dedicating 1T1C cells solely to memory and integrating a compact computation circuit adjacent to the bitline sense amplifier. Memory access power is further reduced with a big-little array structure and a switchable sense amplifier (SWSA), which trades off retention time against energy consumption. Fabricated in 28-nm CMOS, Dyamond integrates 3.54-MB DRAM in a 6.48-mm² area, achieving 27.2-TOPS/W peak efficiency and outstanding performance on advanced models such as BERT and GPT-2.
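The abstract only summarizes the BCA dataflow. As a rough numerical illustration of per-bit-column accumulation, the sketch below decomposes each weight into bit columns, accumulates every column's partial sum separately, and recombines them with shifts according to bit position. The bit width, the bca_mac helper, and the scalar per-column sums are illustrative assumptions; they do not reflect Dyamond's actual circuit, its LBCA/HBCA split, or its analog accumulation.

```python
import numpy as np

# Conceptual sketch of a bit-column-addition (BCA) style MAC:
# weights are split into bit columns, each column is accumulated
# separately, and the columns are combined via shift-add. In the
# described design, lower bit columns would be merged in the analog
# domain before a single ADC readout; this digital sum only mimics
# that behavior for illustration.

def bca_mac(activations: np.ndarray, weights: np.ndarray, w_bits: int = 4) -> int:
    """Multiply-accumulate computed from per-bit-column partial sums."""
    assert activations.shape == weights.shape
    total = 0
    for b in range(w_bits):
        # Bit column b of every weight (0/1 per element).
        bit_col = (weights >> b) & 1
        # Partial sum contributed by this bit column.
        partial = int(np.dot(activations, bit_col))
        # Each column carries its binary weight 2^b.
        total += partial << b
    return total

acts = np.array([3, 1, 2, 4])
wts = np.array([5, 2, 7, 1])      # 4-bit unsigned weights
print(bca_mac(acts, wts), np.dot(acts, wts))  # both print 35
```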
Pages: 1299 - 1310
Page count: 12