Hardware and Software Co-design for Soft Switch in ViT Variants Processing Unit

Cited by: 0
Authors
Hu, Wei [1, 2]
Fan, Jie [1, 2]
Liu, Fang [3, 4]
Hu, Kejie [1, 2]
Affiliations
[1] Wuhan Univ Sci & Technol, Coll Comp Sci, Wuhan, Hubei, Peoples R China
[2] Hubei Prov Key Lab Intelligent Informat Proc & Re, Wuhan, Hubei, Peoples R China
[3] Wuhan Univ, Sch Comp Sci, Wuhan, Hubei, Peoples R China
[4] Wuhan Inst City, Dept Informat Engn, Wuhan, Hubei, Peoples R China
Source
KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT, KSEM 2022, PT III | 2022 / Vol. 13370
Keywords
FPGA; CNN; Transformer; Deep learning; Hardware
DOI
10.1007/978-3-031-10989-8_55
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
As an application of the pure Transformer to computer vision (CV), ViT demonstrates the generality of the Transformer model. However, it requires costly training on large datasets. Recently, some researchers have tried to improve the training efficiency of ViT by combining it with CNNs, exploiting the inductive bias of CNNs. In these models, the MHSA layer has other modules attached alongside it, but existing architectures cannot take advantage of this feature to customize designs that improve computational efficiency and resource utilization. Customizing specialized computing units on an FPGA can meet this need, but existing hardware computing units cannot adapt to combinations of different layer types, and switching between models incurs expensive re-production costs. In this paper, we use hardware and software co-design to build the FPGA computing unit and divide the layers according to their functions, with convolution and Transformer layers classified into one category. Under the coordinated deployment of software, the unit mixes the outputs of layers of the same type through soft switches, so as to adapt to these flexible models. Compared with the original model running on a CPU, it achieves a 26x speedup with an accuracy drop of only 0.9%. In addition, the common data block structure reduces the size of the hardware resource unit by 91.7%.
Pages: 693-705
Number of pages: 13