Convolutional Neural Network-Based Synthesized View Quality Enhancement for 3D Video Coding

Cited by: 31
Authors
Zhu, Linwei [1, 2]
Zhang, Yun [3]
Wang, Shiqi [1, 2]
Yuan, Hui [4]
Kwong, Sam [1, 2]
Ip, Horace H.-S. [1, 2]
Affiliations
[1] City Univ Hong Kong, Dept Comp Sci, Hong Kong, Hong Kong, Peoples R China
[2] City Univ Hong Kong, Shenzhen Res Inst, Shenzhen 518057, Peoples R China
[3] Chinese Acad Sci, Shenzhen Inst Adv Technol, Shenzhen 518055, Peoples R China
[4] Shandong Univ, Sch Informat Sci & Engn, Jinan 250100, Shandong, Peoples R China
Keywords
Convolutional neural network; view synthesis; depth coding; 3D high efficiency video coding; Lagrange multiplier; distortion estimation; multiview video; compression; extensions
DOI
10.1109/TIP.2018.2858022
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The quality of the synthesized view plays an important role in a 3D video system. In this paper, to further improve coding efficiency, a convolutional neural network (CNN)-based synthesized view quality enhancement method for 3D high efficiency video coding (HEVC) is proposed. First, distortion elimination in the synthesized view is formulated as an image restoration task that aims to reconstruct the latent distortion-free synthesized image. Second, the learned CNN models are incorporated into the 3D HEVC codec to improve view synthesis performance for both view synthesis optimization (VSO) and the final synthesized view, where geometric and compression distortions are considered according to the specific characteristics of the synthesized view. Third, a new Lagrange multiplier in the rate-distortion cost function is derived to adapt the CNN-based VSO process and achieve better 3D video coding performance. Extensive experimental results show that the proposed scheme efficiently eliminates artifacts in the synthesized image and reduces the bit rate by 25.9% and 11.7% in terms of peak signal-to-noise ratio and structural similarity index, respectively, significantly outperforming state-of-the-art methods.
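The abstract's third step adjusts the Lagrange multiplier of a rate-distortion cost. As a minimal sketch of why that multiplier matters, the snippet below uses the standard Lagrangian form J = D + λ·R to choose between coding candidates. It is not the paper's derivation: the `Candidate` class, the function names, the candidate values, and the λ values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A hypothetical coding candidate (illustrative, not from the paper)."""
    name: str
    distortion: float  # e.g. squared error measured on the synthesized view
    rate_bits: float   # bits needed to code this candidate


def rd_cost(candidate: Candidate, lam: float) -> float:
    """Standard Lagrangian rate-distortion cost J = D + lambda * R."""
    return candidate.distortion + lam * candidate.rate_bits


def best_candidate(candidates: list[Candidate], lam: float) -> Candidate:
    """Return the candidate with the lowest rate-distortion cost."""
    return min(candidates, key=lambda c: rd_cost(c, lam))


# Illustrative values only: one cheap, more distorted candidate and one
# expensive, less distorted candidate.
candidates = [
    Candidate("low_rate", distortion=100.0, rate_bits=10.0),
    Candidate("high_rate", distortion=60.0, rate_bits=40.0),
]

# A small multiplier favors low distortion; a large one favors low rate.
choice_small_lambda = best_candidate(candidates, lam=0.5).name
choice_large_lambda = best_candidate(candidates, lam=2.0).name
```

In this toy setup, changing λ flips which candidate wins. That is the general reason a distortion term that changes (for example, after an enhancement step) can call for a re-derived multiplier.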
Pages: 5365-5377
Page count: 13