An Application-Specific VLIW Processor with Vector Instruction Set for CNN Acceleration

Cited by: 0
Authors
Bytyn, Andreas [1]
Leupers, Rainer [1]
Ascheid, Gerd [1]
Affiliations
[1] Rhein Westfal TH Aachen, Inst Commun Technol & Embedded Syst, Aachen, Germany
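Source
2019 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS) | 2019
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract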
In recent years, neural networks have surpassed classical algorithms in areas such as object recognition, e.g., in the well-known ImageNet challenge. As a result, great effort is being put into developing fast and efficient accelerators, especially for Convolutional Neural Networks (CNNs). In this work we present ConvAix, a fully C-programmable processor which, contrary to many existing architectures, does not rely on a hard-wired array of multiply-and-accumulate (MAC) units. Instead, it maps computations onto independent vector lanes, making use of a carefully designed vector instruction set. The presented processor is targeted towards latency-sensitive applications and is capable of executing up to 192 MAC operations per cycle. ConvAix operates at a target clock frequency of 400 MHz in 28 nm CMOS, thereby offering state-of-the-art performance with proper flexibility within its target domain. Simulation results for several 2D convolutional layers from well-known CNNs (AlexNet, VGG-16) show an average ALU utilization of 72.5% using vector instructions with 16-bit fixed-point arithmetic. Compared to other well-known designs which are less flexible, ConvAix offers competitive energy efficiency of up to 497 GOP/s/W while even surpassing them in terms of area efficiency and processing speed.
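The throughput figures in the abstract can be related to each other with a short back-of-the-envelope calculation. The sketch below is illustrative only. It takes the 192 MACs per cycle, the 400 MHz clock, and the 72.5% average ALU utilization from the abstract. It assumes the common convention that one MAC counts as two operations (a multiply and an add), and it assumes that ALU utilization scales peak throughput linearly. The abstract confirms neither assumption, and the function names are hypothetical.

```python
# Figures quoted in the abstract.
MACS_PER_CYCLE = 192          # peak MAC operations per cycle
CLOCK_HZ = 400_000_000        # target clock frequency (400 MHz)
ALU_UTILIZATION = 0.725       # average over the simulated conv layers

# Assumption (not stated in the abstract): 1 MAC = 2 operations.
OPS_PER_MAC = 2


def peak_gops(macs_per_cycle=MACS_PER_CYCLE,
              clock_hz=CLOCK_HZ,
              ops_per_mac=OPS_PER_MAC):
    """Peak throughput in GOP/s under the stated ops-per-MAC convention."""
    return macs_per_cycle * ops_per_mac * clock_hz / 1e9


def sustained_gops(utilization=ALU_UTILIZATION, **kwargs):
    """Estimated sustained throughput, assuming utilization scales
    peak throughput linearly (a simplifying assumption)."""
    return peak_gops(**kwargs) * utilization


if __name__ == "__main__":
    print(f"peak:      {peak_gops():.1f} GOP/s")
    print(f"sustained: {sustained_gops():.1f} GOP/s")
```

Under these assumptions the peak works out to 153.6 GOP/s and the utilization-scaled estimate to about 111.4 GOP/s. The abstract does not state which throughput figure the 497 GOP/s/W efficiency refers to, so no power estimate is derived here.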
Pages: 5