Multi-Frame Video Enhancement Using Virtual Frame Synthesized in Time Domain

Cited by: 0
Authors
Ding D. [1 ]
Wu X. [1 ]
Tong J. [1 ]
Yao Z. [1 ]
Pan Z. [1 ,2 ]
Affiliations
[1] School of Information Science and Engineering, Hangzhou Normal University, Hangzhou
[2] Guangzhou NINED LLC, Guangzhou
Source
Jisuanji Fuzhu Sheji Yu Tuxingxue Xuebao/Journal of Computer-Aided Design and Computer Graphics | 2020 | Vol. 32 | No. 05
Keywords
Convolutional neural network; Motion compensation; Video enhancement; Virtual frame;
DOI
10.3724/SP.J.1089.2020.17952
Abstract
Convolutional neural network based video enhancement can effectively reduce compression artifacts, improving both video coding efficiency and subjective quality. State-of-the-art methods usually adopt a single-frame enhancement strategy. However, video frames are also highly correlated in the temporal domain, meaning that temporally neighboring reconstructed frames can provide useful information for enhancing the quality of the current frame. To fully exploit this temporal information, this paper proposes a spatial-temporal video enhancement method that introduces a virtual frame in the time domain. We first employ an adaptive network to predict a virtual frame of the current frame from its neighboring reconstructed frames; this virtual frame carries abundant temporal information. Since the current frame is also highly correlated in the spatial domain, we can combine spatial and temporal information for more extensive enhancement. To this end, we develop an enhancing network, structured in a progressive fusion manner, that fuses the virtual frame with the current frame. Experimental results show that under the random access configuration, the proposed method obtains average PSNR gains of 0.38 dB and 0.06 dB over the H.265/HEVC anchor and the single-frame strategy, respectively. Moreover, it outperforms the state-of-the-art multi-frame quality enhancement network (MFQE) by 0.26 dB PSNR while using only 12.2% of MFQE's parameters. The proposed method also significantly improves the subjective quality of compressed videos. © 2020, Beijing China Science Journal Publishing Co. Ltd. All rights reserved.
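The two-stage pipeline described in the abstract (temporal virtual-frame synthesis, then spatial-temporal fusion with the current frame) can be sketched in a deliberately simplified form. The sketch below is an illustrative assumption, not the paper's method: it replaces the learned adaptive prediction network with a plain average of the two neighboring reconstructed frames, and the learned progressive-fusion network with a fixed weighted blend. All function names and the blend weight are hypothetical.

```python
import numpy as np

def synthesize_virtual_frame(prev_frame, next_frame):
    """Toy stand-in for the paper's adaptive prediction network:
    approximate the current frame from its two reconstructed temporal
    neighbors by simple averaging (no learned motion compensation)."""
    return 0.5 * (prev_frame + next_frame)

def fuse(current_frame, virtual_frame, alpha=0.7):
    """Toy stand-in for the progressive-fusion enhancing network:
    blend spatial (current frame) and temporal (virtual frame)
    information with a fixed weight instead of learned fusion layers."""
    return alpha * current_frame + (1.0 - alpha) * virtual_frame

# Tiny grayscale example: 4x4 "frames" containing a vertical edge
# that moves one column per frame.
prev_f = np.zeros((4, 4)); prev_f[:, 0] = 1.0
next_f = np.zeros((4, 4)); next_f[:, 2] = 1.0
cur_f  = np.zeros((4, 4)); cur_f[:, 1] = 1.0  # compressed current frame

virtual = synthesize_virtual_frame(prev_f, next_f)
enhanced = fuse(cur_f, virtual)
print(enhanced.shape)  # (4, 4)
```

In the actual method both stages are CNNs trained end to end, so the temporal prediction adapts to motion and the fusion weights are learned per pixel rather than fixed as here.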
Pages: 780-786
Page count: 6
Related papers
17 in total
  • [1] Dong C, Loy C C, Tang X O., Accelerating the super-resolution convolutional neural network, Proceedings of European Conference on Computer Vision, pp. 391-407, (2016)
  • [2] Kim J, Lee J K, Lee K M., Accurate image super-resolution using very deep convolutional networks, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1646-1654, (2016)
  • [3] Lai W S, Huang J B, Ahuja N, et al., Deep Laplacian pyramid networks for fast and accurate super-resolution, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 624-632, (2017)
  • [4] Fan Y C, Yu J H, Huang T S., Wide-activated deep residual networks based restoration for BPG-compressed images, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 2621-2624, (2018)
  • [5] Dong C, Deng Y B, Loy C C, et al., Compression artifacts reduction by a deep convolutional network, Proceedings of the IEEE International Conference on Computer Vision, pp. 576-584, (2015)
  • [6] Zhang K, Zuo W M, Chen Y J, et al., Beyond a Gaussian denoiser: residual learning of deep CNN for image denoising, IEEE Transactions on Image Processing, 26, 7, pp. 3142-3155, (2017)
  • [7] Wang T T, Chen M J, Chao H Y., A novel deep learning-based method of improving coding efficiency from the decoder-end for HEVC, Proceedings of the Data Compression Conference, pp. 410-419, (2017)
  • [8] Yang R, Xu M, Wang Z L., Decoder-side HEVC quality enhancement with scalable convolutional neural network, Proceedings of the IEEE International Conference on Multimedia and Expo, pp. 817-822, (2017)
  • [9] Yang R, Xu M, Wang Z L, et al., Multi-frame quality enhancement for compressed video, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 6664-6673, (2018)
  • [10] Caballero J, Ledig C, Aitken A, et al., Real-time video super-resolution with spatio-temporal networks and motion compensation, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4778-4787, (2017)