Multi-Scale Motion Alignment and Frame Reconstruction for Efficient Deep Video Compression

Citations: 0

Authors:

Yang, Gongning [1]

Wei, Xiaojie [1]

Lin, Hongbin [1]

Affiliation:

[1] Fuzhou Univ, Fujian Key Lab Intelligent Proc & Wireless Transmi, Fuzhou 350116, Peoples R China

Source:

IEEE SIGNAL PROCESSING LETTERS | 2024 / Vol. 31

Keywords:

Convolution; Decoding; Motion compensation; Video compression; Feature extraction; Encoding; Video codecs; Deep video compression; end-to-end video codec; flexible rate adjustment; video coding;

DOI:

10.1109/LSP.2024.3443516

Chinese Library Classification:

TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]

Discipline classification codes:

0808 ; 0809 ;

Abstract:

As video data continues to grow, the burden on network transmission increases significantly. Efficient video compression techniques are crucial to meet the rising demand for multimedia content. In this letter, we propose a Multi-scale Motion Alignment and Frame Reconstruction-based Video Codec (MFVC) for efficient video compression. MFVC focuses on optimizing the motion compensation and video reconstruction processes within a deep video compression framework. First, we design a Multi-Scale Motion Alignment Network (MSMA-Net) to achieve precise motion compensation, which extracts multi-scale features from video frames and utilizes flow information for deformable convolution. Second, we design a Frame Reconstruction Network (FR-Net) to recover high-quality video frames, which utilizes reference information for feature enhancement without additional bitrate consumption. Moreover, to achieve smooth rate adjustment, we introduce a feature scaling technique. Experimental results show that, compared to VVC (VTM 13.2), MFVC reduces bitrate by 7.86% at the same PSNR and by 48.34% at the same MS-SSIM.
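The abstract gives no details of the feature scaling technique, so the following is only a generic sketch of the underlying idea, not the authors' method. It assumes a scalar gain applied to a latent feature vector before rounding and undone after decoding. The function names (`encode`, `decode`, `empirical_entropy_bits`) and the Gaussian stand-in for the latent features are hypothetical. The sketch shows that changing the gain trades an estimate of the rate (empirical entropy of the quantized symbols) against distortion (MSE).

```python
import numpy as np


def encode(features, scale):
    """Scale the features, then quantize by rounding to integers."""
    return np.round(features * scale).astype(np.int64)


def decode(symbols, scale):
    """Undo the scaling applied before quantization."""
    return symbols / scale


def empirical_entropy_bits(symbols):
    """Empirical entropy in bits per symbol, used as a rough rate proxy."""
    _, counts = np.unique(symbols, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())


def rate_distortion(features, scale):
    """Return (rate proxy, MSE) for one scale factor."""
    symbols = encode(features, scale)
    recon = decode(symbols, scale)
    mse = float(np.mean((features - recon) ** 2))
    return empirical_entropy_bits(symbols), mse


if __name__ == "__main__":
    # Assumption: unit-variance Gaussian samples stand in for latent features.
    rng = np.random.default_rng(0)
    features = rng.normal(0.0, 1.0, size=10_000)
    for scale in (0.5, 1.0, 2.0, 4.0):
        bits, mse = rate_distortion(features, scale)
        print(f"scale={scale:>4}: ~{bits:.2f} bits/symbol, MSE={mse:.4f}")
```

A larger scale gives finer quantization, which means a higher rate and lower distortion. A learned codec would typically make such gains trainable and apply them per channel, but that goes beyond what the abstract states.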

Pages: 2125-2129

Page count: 5
