Exploring the Tradeoffs of Application-Specific Processing

Cited by: 1
Authors
Schabel, Joshua C. [1]
Franzon, Paul D. [1]
Affiliations
[1] North Carolina State Univ, Dept Elect & Comp Engn, Raleigh, NC 27695 USA
Keywords
ASIP; SIMD; CGRA; processing-in-memory; processing-near-memory; HTM; sparsey; artificial neural networks; ARCHITECTURE; SPECIALIZATION; DESIGN;
DOI
10.1109/JETCAS.2018.2849939
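Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Subject classification codes
0808; 0809;
Abstract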
Non-traditional processing schemes continue to grow in popularity as a means to achieve high performance with greater energy-efficiency. Data-centric processing is one such scheme that targets functional-specialization and memory bandwidth limitations, opening up small processors to wide memory IO. These functional-specific accelerators prove to be an essential component to achieve energy-efficiency and performance, but purely application-specific integrated circuit accelerators have expensive design overheads with limited reusability. We propose an architecture that combines existing processing schemes utilizing CGRAs for dynamic data path configuration as a means to add flexibility and reusability to data-centric acceleration. While flexibility adds a large energy overhead, performance can be regained through intelligent mappings to the CGRA for the functions of interest, while reusability can be gained through incrementally adding general purpose functionality to the processing elements. Building upon previous work accelerating sparse encoded neural networks, we present a CGRA architecture for mapping functional accelerators operating at 500 MHz in 32 nm. This architecture achieves a latency-per-function within 2x of its function-specific counterparts with energy-per-operation increases between 21-188x, and energy-per-area increases between 1.8-3.6x.
Pages: 531-542
Number of pages: 12