PixRevive: Latent Feature Diffusion Model for Compressed Video Quality Enhancement

Cited by: 0
Authors
Wang, Weiran [1]
Jing, Minge [1]
Fan, Yibo [1]
Weng, Wei [2]
Affiliations
[1] Fudan Univ, Sch Microelect, Shanghai 200433, Peoples R China
[2] Kanazawa Univ, Dept Liberal Arts & Sci, Ishikawa 9201192, Japan
Keywords
compressed video restoration; diffusion model; rich detail information; group-wise domain fusion
DOI
10.3390/s24061907
Chinese Library Classification number
O65 [Analytical Chemistry]
Subject classification code
070302; 081704
Abstract
In recent years, advances in imaging sensor technology have driven the rapid spread of high-definition video in Internet of Things (IoT) systems. To adapt to limited uplink bandwidth, most media platforms compress videos into bitstreams for transmission. However, this compression often leads to significant texture loss and artifacts, which severely degrade the Quality of Experience (QoE). We propose a latent feature diffusion model (LFDM) for compressed video quality enhancement, which comprises a compact edge latent feature prior network (ELPN) and a conditional noise prediction network (CNPN). Specifically, we first pre-train ELPN to construct a latent feature space that captures rich detail information for representing sharpness latent variables. Second, we incorporate these latent variables into the prediction network to iteratively guide the generation direction. This resolves the problem that directly applying diffusion models to temporal prediction disrupts inter-frame dependencies, and thereby completes the modeling of temporal correlations. Lastly, we develop a Grouped Domain Fusion module that addresses the diffusion distortion caused by naive cross-domain information fusion. Comparative experiments on the MFQEv2 benchmark validate our algorithm's superior performance in terms of both objective and subjective metrics. By integrating with codecs and image sensors, our method can provide higher video quality.
Pages: 18
Related papers
56 in total
  • [1] Baranchuk D., 2021, arXiv, DOI 10.48550/arXiv.2112.03126
  • [2] BasicVSR++: Improving Video Super-Resolution with Enhanced Propagation and Alignment
    Chan, Kelvin C. K.
    Zhou, Shangchen
    Xu, Xiangyu
    Loy, Chen Change
    [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 5962 - 5971
  • [3] BasicVSR: The Search for Essential Components in Video Super-Resolution and Beyond
    Chan, Kelvin C. K.
    Wang, Xintao
    Yu, Ke
    Dong, Chao
    Loy, Chen Change
    [J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 4945 - 4954
  • [4] DiffusionDet: Diffusion Model for Object Detection
    Chen, Shoufa
    Sun, Peize
    Song, Yibing
    Luo, Ping
    [J]. 2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 19773 - 19786
  • [5] ILVR: Conditioning Method for Denoising Diffusion Probabilistic Models
    Choi, Jooyoung
    Kim, Sungwon
    Jeong, Yonghyun
    Gwon, Youngjune
    Yoon, Sungroh
    [J]. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 14347 - 14356
  • [6] A Convolutional Neural Network Approach for Post-Processing in HEVC Intra Coding
    Dai, Yuanying
    Liu, Dong
    Wu, Feng
    [J]. MULTIMEDIA MODELING (MMM 2017), PT I, 2017, 10132 : 28 - 39
  • [7] Deng JN, 2020, AAAI CONF ARTIF INTE, V34, P10696
  • [8] Dhariwal P, 2021, ADV NEUR IN, V34
  • [9] A Switchable Deep Learning Approach for In-Loop Filtering in Video Coding
    Ding, Dandan
    Kong, Lingyi
    Chen, Guangyao
    Liu, Zoe
    Fang, Yong
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2020, 30 (07) : 1871 - 1887
  • [10] Compression Artifacts Reduction by a Deep Convolutional Network
    Dong, Chao
    Deng, Yubin
    Loy, Chen Change
    Tang, Xiaoou
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015, : 576 - 584