PixRevive: Latent Feature Diffusion Model for Compressed Video Quality Enhancement

Cited by: 0

Authors:

Wang, Weiran [1]

Jing, Minge [1]

Fan, Yibo [1]

Weng, Wei [2]

Affiliations:

[1] Fudan Univ, Sch Microelect, Shanghai 200433, Peoples R China

[2] Kanazawa Univ, Dept Liberal Arts & Sci, Ishikawa 9201192, Japan

Source:

SENSORS | 2024, Vol. 24, No. 6

Keywords:

compressed video restoration; diffusion model; rich detail information; group-wise domain fusion

DOI:

10.3390/s24061907

Chinese Library Classification:

O65 [Analytical Chemistry]

Subject classification codes:

070302; 081704

Abstract:

In recent years, advances in imaging sensor technology have driven the rapid spread of high-definition video in Internet of Things (IoT) systems. To cope with limited uplink bandwidth, most media platforms compress videos into bitstreams for transmission. However, this compression often causes significant texture loss and artifacts, which severely degrade the Quality of Experience (QoE). We propose a latent feature diffusion model (LFDM) for compressed video quality enhancement, which comprises a compact edge latent feature prior network (ELPN) and a conditional noise prediction network (CNPN). Specifically, we first pre-train the ELPN to construct a latent feature space that captures rich detail information and represents it as sharpness latent variables. Second, we incorporate these latent variables into the prediction network to guide the generation direction at each iteration. This addresses the problem that applying diffusion models directly to temporal prediction disrupts inter-frame dependencies, and it allows the model to capture temporal correlations. Lastly, we develop a Grouped Domain Fusion module that addresses the diffusion distortion caused by naive cross-domain information fusion. Comparative experiments on the MFQEv2 benchmark show that our algorithm outperforms the compared methods on both objective and subjective metrics. Integrated with codecs and image sensors, our method can provide higher video quality.


Pages: 18
