LEARNED VIDEO COMPRESSION WITH SPATIAL-TEMPORAL OPTIMIZATION

Cited by: 1

Authors:

Wang, Yiming [1,2]

Huang, Qian [1,2]

Tang, Bin [1,2]

Liu, Wenting [1,2]

Shan, Wenchao [1,2]

Xu, Qian [1,2]

Affiliations:

[1] Hohai Univ, Coll Comp Sci & Software Engn, Nanjing, Peoples R China

[2] Hohai Univ, Key Lab Water Big Data Technol Minist Water Resou, Nanjing, Peoples R China

Source:

2024 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, ICASSP 2024 | 2024

Keywords:

learned video compression; motion vector; spatial-temporal motion refinement; in-loop filter

DOI:

10.1109/ICASSP48485.2024.10446198

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

Optical-flow-based video compression is gradually being replaced by unsupervised deformable convolution (DCN) based methods. This is mainly because the motion vectors (MVs) estimated by existing optical flow networks are not accurate and may introduce extra artifacts. However, DCN-based methods are difficult to train owing to the lack of explicit guidance in the feature space. In this work, we propose a learned video compression method with spatial-temporal optimization. Specifically, we first propose a spatial-temporal motion refinement module to improve the accuracy of the MVs estimated by the optical flow network for prediction. Then, we propose an in-loop filter module to remove compression artifacts and improve the quality of the reconstructed frames. Comprehensive experimental results demonstrate that our proposed method outperforms recent learned methods on three benchmark datasets. Moreover, our method also outperforms H.266/VVC in terms of MS-SSIM.


Pages: 3715-3719

Number of pages: 5
