A Real-Time Convolutional Neural Network for Super-Resolution on FPGA With Applications to 4K UHD 60 fps Video Services

Cited by: 65
Authors
Kim, Yongwoo [1]
Choi, Jae-Seok [1]
Kim, Munchurl [1]
Affiliations
[1] Korea Advanced Institute of Science and Technology (KAIST), School of Electrical Engineering, Daejeon 34141, South Korea
Keywords
Super-resolution; 4K UHD; deep learning; CNN; real-time; FPGA; image superresolution
DOI
10.1109/TCSVT.2018.2864321
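Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Subject classification codes
0808; 0809
Abstract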
In this paper, we present a novel hardware-friendly super-resolution (SR) method based on a convolutional neural network (CNN) and its dedicated hardware (HW) implementation on a field-programmable gate array (FPGA). Although CNN-based SR methods have shown very promising results, their computational complexity is prohibitive for hardware implementation. To the best of our knowledge, we are the first to implement real-time CNN-based SR HW that upscales 2K full high-definition video to 4K ultra-high-definition (UHD) video at 60 frames per second (fps). In our dedicated CNN-based SR HW, low-resolution input frames are processed line by line, and the number of convolutional filter parameters is reduced significantly by incorporating depth-wise separable convolutions with a residual connection. Our CNN-based SR HW incorporates a cascade of 1D convolutions that have large receptive fields along horizontal lines while keeping vertical receptive fields minimal, which reduces the required line memory while achieving SR performance comparable to that of full 2D convolutions. For efficient HW implementation, we use a simple and effective quantization method that causes little peak signal-to-noise ratio (PSNR) degradation. We also propose a compression method that stores intermediate feature map data efficiently, reducing the number of line memories used in HW. Our HW implementation on the FPGA generates 4K UHD frames with higher PSNR values at 60 fps and shows better visual quality than conventional CNN-based SR methods that are trained and tested in software.
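The parameter savings and the throughput target mentioned in the abstract can be made concrete with a little arithmetic. The sketch below is illustrative only: the channel count (32) and the kernel sizes (3x3 and 1x5) are assumptions chosen for the example and are not taken from the paper, and bias terms are ignored. Only the frame sizes and the 60 fps rate come from the abstract.

```python
def standard_conv_params(c_in, c_out, kh, kw):
    """Weights in a standard 2D convolution (biases ignored)."""
    return c_in * c_out * kh * kw


def depthwise_separable_params(c_in, c_out, kh, kw):
    """Weights in a depth-wise conv followed by a 1x1 point-wise conv."""
    depthwise = c_in * kh * kw   # one kh x kw filter per input channel
    pointwise = c_in * c_out     # 1x1 convolution that mixes channels
    return depthwise + pointwise


def pixel_rate(width, height, fps):
    """Pixels per second for a given frame size and frame rate."""
    return width * height * fps


if __name__ == "__main__":
    c = 32  # hypothetical channel count, not from the paper

    # Hypothetical 3x3 layer: standard vs. depth-wise separable
    print(standard_conv_params(c, c, 3, 3))
    print(depthwise_separable_params(c, c, 3, 3))

    # Hypothetical 1x5 layer: wide horizontally, one line tall vertically
    print(depthwise_separable_params(c, c, 1, 5))

    # Rates implied by 2K FHD input and 4K UHD output at 60 fps
    print(pixel_rate(1920, 1080, 60))
    print(pixel_rate(3840, 2160, 60))
```

Under these assumed sizes, the separable layer needs roughly one seventh of the weights of the standard layer, and a kernel that is only one line tall does not require buffering neighboring lines for that layer, which is the intuition behind the line-memory saving described above.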
Pages: 2521-2534
Page count: 14