Near-memory Computing on FPGAs with 3D-stacked Memories: Applications, Architectures, and Optimizations

Cited by: 9
Authors
Iskandar, Veronia [1]
Abd El Ghany, Mohamed A. [2, 3]
Goehringer, Diana [1]
Affiliations
[1] Adapt Dynam Syst, Nothnitzer Str 46, D-01062 Dresden, Germany
[2] German Univ Cairo, Elect Dept, Cairo 11835, Egypt
[3] Tech Univ Darmstadt, Merckstr 25, D-64283 Darmstadt, Germany
Keywords
Near-memory computing; 3D stacking; FPGA architectures; high-bandwidth memory; SCALE
DOI
10.1145/3547658
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
The near-memory computing (NMC) paradigm has emerged as a promising way to overcome the memory wall in future computing architectures. Systems that integrate 3D-stacked DRAM can be leveraged to avoid unnecessary data movement between main memory and the CPU. FPGA vendors have started adding 3D memories to their products to keep pace with the bandwidth requirements of modern memory-intensive applications. Recent NMC proposals target a range of data processing workloads, including graph processing, MapReduce, sorting, machine learning, and database analytics. In this article, we survey the literature on NMC systems built on FPGAs integrated with 3D memories. By combining the high bandwidth offered by such memories with purpose-built hardware, FPGA architectures have become competitive with GPU solutions in speed and energy efficiency. Various FPGA-based NMC designs have been proposed, using both software and hardware optimization methods to achieve high performance and energy efficiency. Our review examines several aspects of NMC designs: platforms, architectures, workloads, and tools. We identify key challenges and open issues, and outline future research directions.
Pages: 32