Multi-Scale Motion Alignment and Frame Reconstruction for Efficient Deep Video Compression

Cited by: 0
Authors
Yang, Gongning [1]
Wei, Xiaojie [1]
Lin, Hongbin [1]
Affiliations
[1] Fuzhou Univ, Fujian Key Lab Intelligent Proc & Wireless Transmi, Fuzhou 350116, Peoples R China
Keywords
Convolution; Decoding; Motion compensation; Video compression; Feature extraction; Encoding; Video codecs; Deep video compression; end-to-end video codec; flexible rate adjustment; video coding
DOI
10.1109/LSP.2024.3443516
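Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract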
As video data continues to grow, the burden on network transmission increases significantly. Efficient video compression techniques are crucial to meet the rising demand for multimedia content. In this letter, we propose a Multi-scale Motion Alignment and Frame Reconstruction-based Video Codec (MFVC) for efficient video compression. MFVC focuses on optimizing the motion compensation and video reconstruction processes within a deep video compression framework. First, we design a Multi-Scale Motion Alignment Network (MSMA-Net) to achieve precise motion compensation, which extracts multi-scale features from video frames and utilizes flow information for deformable convolution. Second, we design a Frame Reconstruction Network (FR-Net) to recover high-quality video frames, which utilizes reference information for feature enhancement without additional bitrate consumption. Moreover, to achieve smooth rate adjustment, we introduce a feature scaling technique. Experimental results show that MFVC reduces bitrate by 7.86%/48.34% compared to VVC (VTM 13.2) at the same PSNR/MS-SSIM.
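The abstract does not say how the feature scaling is implemented. As a rough illustration only, the sketch below assumes a common gain-based scheme: the latent is multiplied by a scalar gain before rounding and divided by the same gain after decoding, so a single model can trade rate against distortion. The function names, the Gaussian stand-in for the latent, and the entropy-based rate proxy are all hypothetical and are not taken from MFVC.

```python
import math
import random
from collections import Counter


def scale_quantize(latent, gain):
    """Scale latent values by `gain`, then round to integers (the coded symbols)."""
    return [round(v * gain) for v in latent]


def dequantize(symbols, gain):
    """Undo the scaling on the decoder side."""
    return [s / gain for s in symbols]


def empirical_entropy_bits(symbols):
    """Zeroth-order entropy in bits per symbol, a rough proxy for bitrate."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Assumption: a zero-mean Gaussian stands in for one channel of a learned latent.
random.seed(0)
latent = [random.gauss(0.0, 4.0) for _ in range(20000)]

results = []
for gain in (0.25, 0.5, 1.0, 2.0):
    symbols = scale_quantize(latent, gain)
    recon = dequantize(symbols, gain)
    results.append((gain, empirical_entropy_bits(symbols), mse(latent, recon)))

for gain, rate, dist in results:
    print(f"gain={gain:<4} rate~{rate:.2f} bits/symbol  mse={dist:.4f}")
```

Under these assumptions, a larger gain means a finer effective quantization step, so the rate proxy rises while the reconstruction error falls. Choosing intermediate gain values would give intermediate operating points, which is one plausible reading of "smooth rate adjustment".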
Pages: 2125-2129
Page count: 5