Disparity-Aware Reference Frame Generation Network for Multiview Video Coding

Cited by: 4
Authors
Lei, Jianjun [1]
Zhang, Zongqian [1,2]
Pan, Zhaoqing [1]
Liu, Dong [3]
Liu, Xiangrui [1]
Chen, Ying [4]
Ling, Nam [5]
Affiliations
[1] Tianjin Univ, Sch Elect & Informat Engn, Tianjin 300072, Peoples R China
[2] Alibaba Grp, Hangzhou 310052, Peoples R China
[3] Univ Sci & Technol China, CAS Key Lab Technol Geo Spatial Informat Proc & A, Hefei 230027, Peoples R China
[4] Alibaba Grp, Hangzhou 310052, Peoples R China
[5] Santa Clara Univ, Dept Comp Sci & Engn, Santa Clara, CA 95053 USA
Funding
National Natural Science Foundation of China
Keywords
Image coding; Video coding; Deep learning; Image reconstruction; Estimation; Encoding; Task analysis; Multiview video coding; reference frame generation; disparity-aware alignment; DAG-Net; 3D-HEVC; VIEW SYNTHESIS; PREDICTION; EXTENSIONS;
DOI
10.1109/TIP.2022.3183436
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Multiview video coding (MVC) aims to compress the multiview video through the elimination of video redundancies, where the quality of the reference frame directly affects the compression efficiency. In this paper, we propose a deep virtual reference frame generation method based on a disparity-aware reference frame generation network (DAG-Net) to transform the disparity relationship between different viewpoints and generate a more reliable reference frame. The proposed DAG-Net consists of a multi-level receptive field module, a disparity-aware alignment module, and a fusion reconstruction module. First, a multi-level receptive field module is designed to enlarge the receptive field, and extract the multi-scale deep features of the temporal and inter-view reference frames. Then, a disparity-aware alignment module is proposed to learn the disparity relationship, and perform disparity shift on the inter-view reference frame to align it with the temporal reference frame. Finally, a fusion reconstruction module is utilized to fuse the complementary information and generate a more reliable virtual reference frame. Experiments demonstrate that the proposed reference frame generation method achieves superior performance for multiview video coding.
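The alignment and fusion steps named in the abstract can be illustrated with a toy example. This is a minimal sketch and not the DAG-Net itself: it assumes a known integer per-pixel horizontal disparity and a fixed averaging weight, whereas the paper describes learned modules. The names `shift_by_disparity`, `fuse`, and `weight` are hypothetical and chosen only for illustration.

```python
import numpy as np


def shift_by_disparity(frame, disparity):
    """Backward-warp a frame horizontally by an integer per-pixel disparity.

    Assumption: disparity is given; in the paper it is learned by the network.
    """
    h, w = frame.shape
    # Source column for every target pixel, clamped to the frame borders.
    cols = np.clip(np.arange(w)[None, :] + disparity, 0, w - 1)
    rows = np.arange(h)[:, None]
    return frame[rows, cols]


def fuse(temporal_ref, aligned_inter_view, weight=0.5):
    """Blend two aligned references with a fixed weight (stand-in for learned fusion)."""
    return weight * temporal_ref + (1.0 - weight) * aligned_inter_view


# Toy data: the inter-view frame shows the same content shifted by two columns.
temporal = np.tile(np.arange(8, dtype=np.float64), (4, 1))
inter_view = np.roll(temporal, -2, axis=1)
disparity = np.full((4, 8), -2, dtype=np.int64)

aligned = shift_by_disparity(inter_view, disparity)
virtual_ref = fuse(temporal, aligned)
```

After the shift, the inter-view frame agrees with the temporal reference everywhere except the two border columns, where no valid source pixel exists and the nearest column is repeated.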
Pages: 4515-4526
Page count: 12