Multi-Scale Motion Alignment and Frame Reconstruction for Efficient Deep Video Compression

Cited by: 1
Authors
Yang, Gongning [1]
Wei, Xiaojie [1]
Lin, Hongbin [1]
Affiliations
[1] Fuzhou Univ, Fujian Key Lab Intelligent Proc & Wireless Transmi, Fuzhou 350116, Peoples R China
Keywords
Convolution; Decoding; Motion compensation; Video compression; Feature extraction; Encoding; Video codecs; Deep video compression; end-to-end video codec; flexible rate adjustment; video coding
DOI
10.1109/LSP.2024.3443516
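Chinese Library Classification
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification codes
0808; 0809
Abstract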
As video data continues to grow, the burden on network transmission increases significantly. Efficient video compression techniques are crucial to meet the rising demand for multimedia content. In this letter, we propose a Multi-scale Motion Alignment and Frame Reconstruction-based Video Codec (MFVC) for efficient video compression. MFVC focuses on optimizing the motion compensation and video reconstruction processes within a deep video compression framework. First, we design a Multi-Scale Motion Alignment Network (MSMA-Net) to achieve precise motion compensation, which extracts multi-scale features from video frames and utilizes flow information for deformable convolution. Second, we design a Frame Reconstruction Network (FR-Net) to recover high-quality video frames, which utilizes reference information for feature enhancement without additional bitrate consumption. Moreover, to achieve smooth rate adjustment, we introduce a feature scaling technique. Experimental results show that MFVC reduces bitrate by 7.86%/48.34% compared to VVC (VTM 13.2) at the same PSNR/MS-SSIM.
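The abstract does not say how the feature scaling technique is implemented. As a rough illustration of why scaling features can give smooth rate adjustment, the sketch below assumes a single scalar gain applied to a latent before rounding and inverted after decoding. The function names, the Gaussian toy latent, and the gain values are all hypothetical and are not taken from the paper.

```python
import numpy as np

def encode(latent, gain):
    # Assumed scheme: scale the latent, then round to integers.
    # A larger gain means a finer effective quantization step.
    return np.round(latent * gain)

def decode(symbols, gain):
    # Undo the scaling on the decoder side.
    return symbols / gain

def empirical_entropy_bits(symbols):
    # Empirical entropy in bits per symbol, used here as a rate proxy.
    _, counts = np.unique(symbols, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

# Toy latent standing in for a learned feature map (hypothetical data).
rng = np.random.default_rng(0)
latent = rng.normal(0.0, 4.0, size=10_000)

results = []
for gain in (0.25, 0.5, 1.0, 2.0):
    q = encode(latent, gain)
    rec = decode(q, gain)
    mse = float(np.mean((latent - rec) ** 2))
    results.append((gain, empirical_entropy_bits(q), mse))

for gain, bits, mse in results:
    print(f"gain={gain:<5} rate~{bits:.2f} bits/symbol  mse={mse:.4f}")
```

In this toy setting, raising the gain increases the estimated rate and lowers the reconstruction error, so a continuous gain traces a continuous rate-distortion trade-off without retraining. A real codec would likely use learned, possibly per-channel, scaling together with a learned entropy model.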
Pages: 2125-2129
Page count: 5