TransPimLib: Efficient Transcendental Functions for Processing-in-Memory Systems

Times cited: 5
Authors
Item, Maurus [1]
Gomez-Luna, Juan [1]
Guo, Yuxin [1]
Oliveira, Geraldo F. [1]
Sadrosadati, Mohammad [1]
Mutlu, Onur [1]
Affiliations
[1] Swiss Fed Inst Technol, Zurich, Switzerland
Source
2023 IEEE INTERNATIONAL SYMPOSIUM ON PERFORMANCE ANALYSIS OF SYSTEMS AND SOFTWARE, ISPASS | 2023
Keywords
processing-in-memory; processing-near-memory; transcendental functions; activation functions; machine learning
DOI
10.1109/ISPASS57527.2023.00031
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Processing-in-memory (PIM) promises to alleviate the data movement bottleneck in modern computing systems. However, current real-world PIM systems have the inherent disadvantage that their hardware is more constrained than in conventional processors (CPU, GPU), due to the difficulty and cost of building processing elements near or inside the memory. As a result, general-purpose PIM architectures support fairly limited instruction sets and struggle to execute complex operations such as transcendental functions and other hard-to-calculate operations (e.g., square root). These operations are particularly important for some modern workloads, e.g., activation functions in machine learning applications. To provide support for transcendental (and other hard-to-calculate) functions in general-purpose PIM systems, we present TransPimLib, a library that provides CORDIC-based and LUT-based methods for trigonometric functions, hyperbolic functions, exponentiation, logarithm, square root, etc. We develop an implementation of TransPimLib for the UPMEM PIM architecture and perform a thorough evaluation of TransPimLib's methods in terms of performance and accuracy, using microbenchmarks and three full workloads (Blackscholes, Sigmoid, Softmax). We open-source all our code and datasets at https://github.com/CMU-SAFARI/transpimlib.
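
The two method families named in the abstract can be illustrated with a minimal sketch. This is not TransPimLib's code or API: the function names (`cordic_sin_cos`, `make_sin_lut`, `lut_sin`), the iteration count, and the table size are hypothetical choices for illustration, and floating-point arithmetic is used for readability where a constrained processor would more plausibly use fixed-point shifts and adds. The first function is textbook rotation-mode CORDIC; the second pair is a lookup table with linear interpolation.

```python
import math


def cordic_sin_cos(theta, iterations=32):
    """Approximate (sin, cos) of theta in [-pi/2, pi/2] with rotation-mode CORDIC.

    Illustrative only: names and parameters are assumptions, not a library API.
    """
    # Elementary rotation angles atan(2^-i); these would be precomputed constants.
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]
    # Each micro-rotation stretches the vector by sqrt(1 + 2^-2i);
    # start from the inverse of the accumulated gain to compensate.
    gain_inv = 1.0
    for i in range(iterations):
        gain_inv /= math.sqrt(1.0 + 2.0 ** (-2 * i))
    x, y, z = gain_inv, 0.0, theta
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0      # rotate toward residual angle zero
        shift = 2.0 ** -i                # a right shift in fixed-point form
        x, y = x - d * y * shift, y + d * x * shift
        z -= d * angles[i]
    return y, x                          # (sin, cos)


def make_sin_lut(entries=1024):
    """Build a sine table over [0, pi/2]; returns (table, step)."""
    step = (math.pi / 2) / (entries - 1)
    return [math.sin(i * step) for i in range(entries)], step


def lut_sin(theta, lut, step):
    """Approximate sin(theta) for theta in [0, pi/2] by linear interpolation."""
    pos = theta / step
    i = min(int(pos), len(lut) - 2)      # clamp so that i + 1 stays in range
    frac = pos - i
    return lut[i] + frac * (lut[i + 1] - lut[i])


if __name__ == "__main__":
    s, c = cordic_sin_cos(0.5)
    table, step = make_sin_lut()
    print(abs(s - math.sin(0.5)), abs(lut_sin(0.5, table, step) - math.sin(0.5)))
```

The sketch shows the general trade-off between the two families: CORDIC needs only a small table of elementary angles but runs one iteration per bit of accuracy, while the lookup table answers in a fixed number of operations at the cost of memory that grows with the desired accuracy.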
Pages: 235-247
Page count: 13