Multi-modal degradation feature learning for unified image restoration based on contrastive learning

Cited: 0
Authors
Chen, Lei [1]
Xiong, Qingbo [1]
Zhang, Wei [1,2]
Liang, Xiaoli [1]
Gan, Zhihua [1]
Li, Liqiang [3]
He, Xin [1]
Affiliations
[1] Henan Univ, Sch Software, Jinming Rd, Kaifeng 475004, Peoples R China
[2] China Univ Labor Relat, Sch Appl Technol, Zengguang Rd, Beijing 100048, Peoples R China
[3] Shangqiu Normal Univ, Sch Phys, Shangqiu 476000, Peoples R China
Funding
US National Science Foundation;
Keywords
Unified image restoration; Multi-modal features; Contrastive learning; Deep learning;
DOI
10.1016/j.neucom.2024.128955
CLC classification
TP18 [Artificial intelligence theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we address the unified image restoration challenge by reframing it as a contrastive learning-based classification problem. Despite the significant strides deep learning methods have made in improving restoration quality, their limited ability to generalize across diverse degradation types and intensities necessitates training a separate model for each degradation scenario. We propose an all-encompassing approach that restores images from various unknown corruption types and levels. Our method learns representations of the latent sharp image's degradation together with accompanying textual features (such as dataset categories and image content descriptions), converting these into prompts that are embedded within a reconstruction network to enhance cross-database restoration performance, yielding a unified image reconstruction framework. The study involves two stages. In the first stage, we design MultiContentNet, which learns multi-modal features (MMFs) of the latent sharp image. This network encodes visual degradation expressions and contextual text features into latent variables, thereby exerting a guided classification effect. Specifically, MultiContentNet is trained as an auxiliary controller that takes the degraded input image and, through contrastive learning, extracts MMFs of the latent target image, effectively generating natural classifiers tailored to different degradation types. The second stage integrates the learned MMFs into an image restoration network via cross-attention mechanisms, guiding the restoration model toward high-fidelity image recovery. Experiments on six blind image restoration tasks demonstrate that the proposed method achieves state-of-the-art performance, highlighting the potential of MMFs from large-scale pretrained vision-language models for high-quality unified image reconstruction.
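To make the first-stage idea concrete: the abstract describes contrastive learning that aligns a degraded image's features with prompt features so that degradation types become natural classes. The record gives no implementation details, so the following is only a minimal sketch assuming an InfoNCE-style symmetric objective (a common choice for such alignment, not confirmed by the paper); the embedding values and dimensionality are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def info_nce(image_feats, prompt_feats, temperature=0.07):
    """InfoNCE-style contrastive loss: image i's positive is prompt i,
    all other prompts in the batch act as negatives."""
    n = len(image_feats)
    loss = 0.0
    for i in range(n):
        logits = [cosine(image_feats[i], p) / temperature for p in prompt_feats]
        m = max(logits)  # stabilize the log-sum-exp
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        loss += -(logits[i] - log_denom)  # negative log-softmax of the positive
    return loss / n

# Toy 3-D embeddings for three degradation types (e.g. noise, blur, rain);
# values are invented purely for illustration.
img = [[1.0, 0.1, 0.0], [0.0, 1.0, 0.1], [0.1, 0.0, 1.0]]
txt = [[0.9, 0.2, 0.0], [0.1, 1.0, 0.0], [0.0, 0.1, 0.9]]

aligned = info_nce(img, txt)                          # matched pairs
shuffled = info_nce(img, [txt[1], txt[2], txt[0]])    # mismatched pairs
```

Minimizing this loss pulls each degraded image toward the prompt describing its own degradation and pushes it away from the others, which is what lets the learned features act as a classifier over degradation types; the aligned batch above yields a much lower loss than the shuffled one.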
Pages: 11
Related papers
50 records
  • [1] Estimation of Degradation Degree in Road Infrastructure Based on Multi-Modal ABN Using Contrastive Learning
    Higashi, Takaaki
    Ogawa, Naoki
    Maeda, Keisuke
    Ogawa, Takahiro
    Haseyama, Miki
    SENSORS, 2023, 23 (03)
  • [2] Turbo your multi-modal classification with contrastive learning
    Zhang, Zhiyu
    Liu, Da
    Liu, Shengqiang
    Wang, Anna
    Gao, Jie
    Li, Yali
    INTERSPEECH 2023, 2023, : 1848 - 1852
  • [3] Contrastive Multi-Modal Knowledge Graph Representation Learning
    Fang, Quan
    Zhang, Xiaowei
    Hu, Jun
    Wu, Xian
    Xu, Changsheng
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2023, 35 (09) : 8983 - 8996
  • [4] CLMTR: a generic framework for contrastive multi-modal trajectory representation learning
    Liang, Anqi
    Yao, Bin
    Xie, Jiong
    Zheng, Wenli
    Shen, Yanyan
    Ge, Qiqi
    GEOINFORMATICA, 2024, : 233 - 253
  • [5] Deep learning-based multi-modal computing with feature disentanglement for MRI image synthesis
    Fei, Yuchen
    Zhan, Bo
    Hong, Mei
    Wu, Xi
    Zhou, Jiliu
    Wang, Yan
    MEDICAL PHYSICS, 2021, 48 (07) : 3778 - 3789
  • [6] Improving Code Search with Multi-Modal Momentum Contrastive Learning
    Shi, Zejian
    Xiong, Yun
    Zhang, Yao
    Jiang, Zhijie
    Zhao, Jinjing
    Wang, Lei
    Li, Shanshan
    2023 IEEE/ACM 31ST INTERNATIONAL CONFERENCE ON PROGRAM COMPREHENSION, ICPC, 2023, : 280 - 291
  • [7] Improving Medical Multi-modal Contrastive Learning with Expert Annotations
    Kumar, Yogesh
    Marttinen, Pekka
    COMPUTER VISION - ECCV 2024, PT XX, 2025, 15078 : 468 - 486
  • [8] Multi-modal haptic image recognition based on deep learning
    Han, Dong
    Nie, Hong
    Chen, Jinbao
    Chen, Meng
    Deng, Zhen
    Zhang, Jianwei
    SENSOR REVIEW, 2018, 38 (04) : 486 - 493
  • [9] CrossMoCo: Multi-modal Momentum Contrastive Learning for Point Cloud
    Paul, Sneha
    Patterson, Zachary
    Bouguila, Nizar
    2023 20TH CONFERENCE ON ROBOTS AND VISION, CRV, 2023, : 273 - 280
  • [10] Collaborative denoised graph contrastive learning for multi-modal recommendation
    Xu, Fuyong
    Zhu, Zhenfang
    Fu, Yixin
    Wang, Ru
    Liu, Peiyu
    INFORMATION SCIENCES, 2024, 679