FSS: algorithm and neural network accelerator for style transfer

Cited by: 0

Authors:

Ling, Yi [1]

Huang, Yujie [1,2]

Cai, Yujie [1,2]

Li, Zhaojie [1,2]

Wang, Mingyu [1]

Li, Wenhong [1]

Zeng, Xiaoyang [1]

Affiliations:

[1] Fudan Univ, State Key Lab AS & Syst, Shanghai 201203, Peoples R China

[2] Shanghai ExploreX Technol Co Ltd, Shanghai 200120, Peoples R China

Source:

SCIENCE CHINA-INFORMATION SCIENCES | 2024 / Vol. 67 / Issue 02

Funding:

National Natural Science Foundation of China;

Keywords:

neural network accelerator; style transfer; neural network; deep learning;

DOI:

10.1007/s11432-022-3676-2

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Subject classification code:

0812;

Abstract:

Neural networks (NNs), owing to their impressive performance, have gradually begun to dominate multimedia processing. For resource-constrained and energy-sensitive mobile devices, an efficient NN accelerator is necessary. Style transfer is an important multimedia application. However, existing arbitrary style transfer networks are complex and not well supported by current NN accelerators, limiting their application on mobile devices. Moreover, the quality of style transfer needs improvement. Thus, we design the FastStyle system (FSS), where a novel algorithm and an NN accelerator are proposed for style transfer. In FSS, we first propose a novel arbitrary style transfer algorithm, FastStyle. We propose a light network that contributes to high quality and low computational complexity and a prior mechanism to avoid retraining when the style changes. Then, we redesign an NN accelerator for FastStyle by applying two improvements to the basic NVIDIA deep learning accelerator (NVDLA) architecture. First, a flexible dat FSM and wt FSM are redesigned to enable the original data path to perform other operations (including the GRAM operation) by software programming. Moreover, statistics and judgment logic are designed to utilize the continuity of a video stream and remove the data dependency in the instance normalization, which improves the accelerator performance by 18.6%. The experimental results demonstrate that the proposed FastStyle can achieve higher quality with a lower computational cost, making it more suitable for mobile devices. The proposed NN accelerator is implemented on the Xilinx VCU118 FPGA with a 180 MHz clock. Experimental results show that the accelerator can stylize 512×512-pixel video at 20 FPS, and the measured performance reaches up to 306.07 GOPS. The ASIC implementation in TSMC 28 nm achieves about 22 FPS for 720p video.
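The two operations named in the abstract, the GRAM operation and instance normalization that exploits video continuity, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not specify the judgment logic, so the function names, the `threshold` parameter, the fallback rule, and the normalization of the Gram matrix by the number of spatial positions are all assumptions made for illustration. The sketch reads "removing the data dependency" as normalizing a frame with the statistics of the previous frame when the two frames are similar, so that normalization need not wait for the current frame's mean and variance.

```python
from statistics import fmean, pvariance


def gram_matrix(features):
    """Gram matrix of a feature map given as C rows of H*W flattened values.

    Dividing by the number of positions is an assumption, not from the source.
    """
    n = len(features[0])
    return [[sum(a * b for a, b in zip(fi, fj)) / n for fj in features]
            for fi in features]


def channel_stats(channel):
    """Mean and population variance of one channel of one frame."""
    return fmean(channel), pvariance(channel)


def instance_norm(channel, mean, var, eps=1e-5):
    """Normalize one channel with the supplied statistics."""
    scale = (var + eps) ** -0.5
    return [(x - mean) * scale for x in channel]


def normalize_stream(frames, threshold=0.5):
    """Normalize single-channel frames, reusing the previous frame's statistics.

    `threshold` and the fallback rule are hypothetical stand-ins for the
    judgment logic mentioned in the abstract.
    """
    out, prev = [], None
    for frame in frames:
        cur = channel_stats(frame)  # would be gathered alongside the compute
        if prev is not None and abs(cur[0] - prev[0]) <= threshold:
            stats = prev  # similar frames: reuse the earlier statistics
        else:
            stats = cur   # first frame or scene change: use exact statistics
        out.append(instance_norm(frame, *stats))
        prev = cur
    return out
```

In this sketch, a frame that closely resembles its predecessor is normalized without depending on its own statistics, while a large change in the mean falls back to exact instance normalization.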

Pages: 14

References: 48 in total
