Energy-Quality Scalable Design Space Exploration of Approximate FFT Hardware Architectures

Cited by: 11
Authors
Pereira, Pedro Taua Lopes [1]
da Costa, Patricia Ucker Leleu [1]
Ferreira, Guilherme da Costa [1]
de Abreu, Brunno Alves [1]
Paim, Guilherme [2, 3]
da Costa, Eduardo Antonio Cesar [4]
Bampi, Sergio [1]
Affiliations
[1] Univ Fed Rio Grande Sul UFRGS, Inst Informat, PGMICRO, BR-91501970 Porto Alegre, RS, Brazil
[2] Univ Fed Rio Grande Sul UFRGS, Inst Informat, PGMICRO, BR-91501970 Porto Alegre, RS, Brazil
[3] Inst Engn Sistemas Computadores Invest & Desenvol, High Performance Comp Architectures & Syst HPCAS, P-1000029 Lisbon, Portugal
[4] Univ Catolica Pelotas UCPel, Grad Program Comput & Elect Engn, BR-96015560 Pelotas, RS, Brazil
Keywords
FFT; radix-2; butterflies; approximate adders; HIGH-SPEED; CIRCUITS; MULTIPLIER; ALGORITHM; SYSTEMS; ADDER
DOI
10.1109/TCSI.2022.3191180
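Chinese Library Classification
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Subject classification codes
0808; 0809
Abstract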
This paper presents a comprehensive design space exploration for boosting the energy efficiency of a fast Fourier transform (FFT) VLSI accelerator, exploiting several approximate multipliers (AxM) combined with approximate adder (AxA) circuits. The FFT hardware presented here is a fixed-point sequential architecture using a radix-2 butterfly with decimation in time. We explore a set of AxMs - namely Dynamic Range Unbiased (DRUM), Rounding-based Approximate (RoBA), Leading one Bit-based Approximate (LoBA), and the Truncated approach - jointly with the LOA, ETA-I, Copy(A), Copy(B), Trunc(0), and Trunc(1) approximate adders. The approximate arithmetic operators are used in the butterfly kernel, and the approximation levels are explored (the L and K least-significant bits for the AxM and AxA, respectively), aiming to find the most energy-efficient configuration under a design-time quality-of-results (QoR) constraint. The mean square error and peak signal-to-noise ratio metrics determine which approximation levels, combining L and K variations, allow the FFT to process signals and generate spectrograms without significant losses. Our results show that the LoBA multiplier with L=8, together with the LOA, Trunc(1), and Trunc(0) adders at different approximation levels, provides the largest energy savings with controllable quality degradation, presenting a minimum decrease of 20.2% in power dissipation without degrading the spectrogram generation quality.
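The role of the K least-significant bits in the approximate adders, and of the mean square error and peak signal-to-noise ratio as quality metrics, can be illustrated with a small software model. The sketch below is not the authors' hardware or evaluation flow. It assumes unsigned operands, the commonly described Lower-part OR Adder (LOA) behavior (the K low bits are OR-ed, and the carry into the exact upper part is the AND of the operands' bit K-1), and it assumes that Trunc(0) and Trunc(1) fill the K low bits with zeros or ones and add only the upper parts. The function names and the operand set are hypothetical.

```python
import math


def loa_add(a: int, b: int, k: int) -> int:
    """Assumed LOA model: OR the k low bits, add the upper bits exactly."""
    if k == 0:
        return a + b  # no approximation
    low_mask = (1 << k) - 1
    low = (a | b) & low_mask
    # Carry into the exact part: AND of bit k-1 of both operands.
    carry_in = (a >> (k - 1)) & (b >> (k - 1)) & 1
    high = (a >> k) + (b >> k) + carry_in
    return (high << k) | low


def trunc_add(a: int, b: int, k: int, fill: int) -> int:
    """Assumed Trunc(fill) model: k low bits forced to 0s or 1s."""
    low = ((1 << k) - 1) if fill else 0
    high = (a >> k) + (b >> k)
    return (high << k) | low


def mse(ref, approx) -> float:
    """Mean square error between two equal-length sequences."""
    return sum((r - x) ** 2 for r, x in zip(ref, approx)) / len(ref)


def psnr(ref, approx, peak: float) -> float:
    """Peak signal-to-noise ratio in dB; infinite when there is no error."""
    err = mse(ref, approx)
    return math.inf if err == 0 else 10 * math.log10(peak ** 2 / err)


if __name__ == "__main__":
    # Hypothetical 8-bit operand pairs, chosen only for illustration.
    pairs = [(a, b) for a in range(0, 256, 7) for b in range(0, 256, 11)]
    exact = [a + b for a, b in pairs]
    for k in range(5):
        approx = [loa_add(a, b, k) for a, b in pairs]
        print(k, mse(exact, approx), psnr(exact, approx, peak=510))
```

Sweeping K in this way mirrors, in a very reduced form, the idea of choosing the largest approximation level whose error still satisfies a quality constraint. The figures it prints say nothing about the energy or quality results reported in the paper.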
Pages: 4524-4534
Page count: 11
Related papers
26 in total
  • [21] Design of Energy Efficient Multiplier with Approximate Computing on Scalable Compressor for Error-Resilient Image Contrast Enhancement
    Savio, M. Maria Dominic
    Deepa, T.
    WIRELESS PERSONAL COMMUNICATIONS, 2022, 127 (04) : 2997 - 3013
  • [23] Using Multi-objective Design Space Exploration to Enable Run-time Resource Management for Reconfigurable Architectures
    Mariani, Giovanni
    Sima, Vlad-Mihai
    Palermo, Gianluca
    Zaccaria, Vittorio
    Silvano, Cristina
    Bertels, Koen
    DESIGN, AUTOMATION & TEST IN EUROPE (DATE 2012), 2012, : 1379 - 1384
  • [24] Complete design space exploration of isolated hybrid renewable energy system via dynamic programming
    Lee, Kun
    Kum, Dongsuk
    ENERGY CONVERSION AND MANAGEMENT, 2019, 196 : 920 - 934
  • [25] A Cross-Layer Gate-Level-to-Application Co-Simulation for Design Space Exploration of Approximate Circuits in HEVC Video Encoders
    Paim, Guilherme
    Rocha, Leandro Mateus Giacomini
    Amrouch, Hussam
    da Costa, Eduardo Antonio Cesar
    Bampi, Sergio
    Henkel, Jorg
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2020, 30 (10) : 3814 - 3828
  • [26] Design-Space Exploration and Optimization of an Energy-Efficient and Reliable 3-D Small-World Network-on-Chip
    Das, Sourav
    Doppa, Janardhan Rao
    Pande, Partha Pratim
    Chakrabarty, Krishnendu
    IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2017, 36 (05) : 719 - 732