Configurable CNN Accelerator in Speech Processing based on Vector Convolution

Cited by: 2
Authors
Hui, Lanqing [1]
Cao, Shan [1]
Chen, Zhiyong [1]
Li, Shan [1]
Xu, Shugong [1]
Affiliations
[1] Shanghai Univ, Sch Commun & Informat Engn, Shanghai, Peoples R China
Source
2022 IEEE INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE CIRCUITS AND SYSTEMS (AICAS 2022): INTELLIGENT TECHNOLOGY IN THE POST-PANDEMIC ERA | 2022
Funding
National Natural Science Foundation of China;
Keywords
Accelerator; CNN; speech processing; FPGA implementation; DEEP NEURAL-NETWORKS;
DOI
10.1109/AICAS54282.2022.9869904
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In speech applications, the input feature maps (IFMs) and kernels of neural networks vary widely in shape and size, which poses significant challenges for hardware acceleration. This paper introduces a configurable CNN accelerator that balances flexibility and efficiency across the various neural network models used in speech processing. A vector convolution scheme is first proposed that rearranges IFM rows and weight values into vectors, converting element-wise convolution into vector operations and removing the constraints of kernel-centric processing. A vector processing element (VPE) structure is then introduced to accommodate the progressive downscaling of IFMs with little control overhead, and the CNN accelerator architecture is built around it. FPGA implementation results show that the proposed architecture increases throughput by 86% over state-of-the-art FPGA accelerators on the VGG16 network, while maintaining high DSP utilization for both 1D and 2D CNNs with various input sizes.
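The abstract gives no implementation details, so the following is only an illustrative sketch of the general idea of replacing element-wise convolution with vector operations on IFM rows. It assumes stride 1, no padding, and a single channel; the function names `conv2d_reference` and `conv2d_row_vectors` are hypothetical and do not come from the paper, and the sketch does not model the VPE or the accelerator's actual data layout.

```python
def conv2d_reference(ifm, kernel):
    """Element-wise, kernel-centric convolution (stride 1, no padding)."""
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = len(ifm) - kh + 1, len(ifm[0]) - kw + 1
    out = [[0] * ow for _ in range(oh)]
    for r in range(oh):
        for c in range(ow):
            for i in range(kh):
                for j in range(kw):
                    out[r][c] += ifm[r + i][c + j] * kernel[i][j]
    return out


def conv2d_row_vectors(ifm, kernel):
    """Same result, expressed as scalar-weight times row-vector accumulations.

    Assumption for illustration: each weight is applied to a shifted slice of
    one IFM row, so the inner work is a vector multiply-accumulate whose
    length follows the IFM width rather than the kernel size.
    """
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = len(ifm) - kh + 1, len(ifm[0]) - kw + 1
    out = []
    for r in range(oh):
        acc = [0] * ow  # one output row, accumulated as a vector
        for i in range(kh):
            row = ifm[r + i]
            for j in range(kw):
                w = kernel[i][j]
                segment = row[j:j + ow]  # shifted slice of the IFM row
                acc = [a + w * x for a, x in zip(acc, segment)]
        out.append(acc)
    return out


# 2D case: 3x4 IFM with a 2x2 kernel
ifm_2d = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
kernel_2d = [[1, 0], [0, 1]]
result_2d = conv2d_row_vectors(ifm_2d, kernel_2d)

# 1D case: a single IFM row with a 1x3 kernel uses the same code path
ifm_1d = [[1, 2, 3, 4, 5]]
kernel_1d = [[1, 1, 1]]
result_1d = conv2d_row_vectors(ifm_1d, kernel_1d)
```

In this sketch the 1D and 2D cases share one routine, and the vector length tracks the output width, which loosely mirrors the abstract's point about handling IFMs of different sizes without kernel-specific processing.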
Pages: 146-149
Page count: 4
Related papers
13 in total
[1]   A Survey of Accelerator Architectures for Deep Neural Networks [J].
Chen, Yiran ;
Xie, Yuan ;
Song, Linghao ;
Chen, Fan ;
Tang, Tianqi .
ENGINEERING, 2020, 6 (03) :264-274
[2]   Eyeriss v2: A Flexible Accelerator for Emerging Deep Neural Networks on Mobile Devices [J].
Chen, Yu-Hsin ;
Yang, Tien-Ju ;
Emer, Joel S. ;
Sze, Vivienne .
IEEE JOURNAL ON EMERGING AND SELECTED TOPICS IN CIRCUITS AND SYSTEMS, 2019, 9 (02) :292-308
[3]   RepVGG: Making VGG-style ConvNets Great Again [J].
Ding, Xiaohan ;
Zhang, Xiangyu ;
Ma, Ningning ;
Han, Jungong ;
Ding, Guiguang ;
Sun, Jian .
2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, :13728-13737
[4]   Angel-Eye: A Complete Design Flow for Mapping CNN Onto Embedded FPGA [J].
Guo, Kaiyuan ;
Sui, Lingzhi ;
Qiu, Jiantao ;
Yu, Jincheng ;
Wang, Junbin ;
Yao, Song ;
Han, Song ;
Wang, Yu ;
Yang, Huazhong .
IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2018, 37 (01) :35-47
[5]   Far-Field Automatic Speech Recognition [J].
Haeb-Umbach, Reinhold ;
Heymann, Jahn ;
Drude, Lukas ;
Watanabe, Shinji ;
Delcroix, Marc ;
Nakatani, Tomohiro .
PROCEEDINGS OF THE IEEE, 2021, 109 (02) :124-148
[6]  
Jia H, 2021, INT J INTELLIGENCE S, V11, P57
[7]  
Kim B, 2022, Arxiv, DOI arXiv:2106.04140
[8]   Optimizing the Convolution Operation to Accelerate Deep Neural Networks on FPGA [J].
Ma, Yufei ;
Cao, Yu ;
Vrudhula, Sarma ;
Seo, Jae-sun .
IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, 2018, 26 (07) :1354-1367
[9]   A Resource-Efficient Multiplierless Systolic Array Architecture for Convolutions in Deep Networks [J].
Parmar, Yashrajsinh ;
Sridharan, K. .
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS, 2020, 67 (02) :370-374
[10]   Towards a Uniform Template-based Architecture for Accelerating 2D and 3D CNNs on FPGA [J].
Shen, Junzhong ;
Huang, You ;
Wang, Zelong ;
Qiao, Yuran ;
Wen, Mei ;
Zhang, Chunyuan .
PROCEEDINGS OF THE 2018 ACM/SIGDA INTERNATIONAL SYMPOSIUM ON FIELD-PROGRAMMABLE GATE ARRAYS (FPGA'18), 2018, :97-106