Multi-modal degradation feature learning for unified image restoration based on contrastive learning

Cited: 0
Authors
Chen, Lei [1 ]
Xiong, Qingbo [1 ]
Zhang, Wei [1 ,2 ]
Liang, Xiaoli [1 ]
Gan, Zhihua [1 ]
Li, Liqiang [3 ]
He, Xin [1 ]
Affiliations
[1] Henan Univ, Sch Software, Jinming Rd, Kaifeng 475004, Peoples R China
[2] China Univ Labor Relat, Sch Appl Technol, Zengguang Rd, Beijing 100048, Peoples R China
[3] Shangqiu Normal Univ, Sch Phys, Shangqiu 476000, Peoples R China
Funding
US National Science Foundation;
Keywords
Unified image restoration; Multi-modal features; Contrastive learning; Deep learning;
DOI
10.1016/j.neucom.2024.128955
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we address the unified image restoration challenge by reframing it as a contrastive learning-based classification problem. Despite the significant strides made by deep learning methods in enhancing image restoration quality, their limited capacity to generalize across diverse degradation types and intensities necessitates training a separate model for each specific degradation scenario. We propose an all-encompassing approach that can restore images from various unknown corruption types and levels. Our method learns representations of the latent sharp image's degradation and accompanying textual features (such as dataset categories and image content descriptions), converting these into prompts that are then embedded within a reconstruction network to enhance cross-database restoration performance. This culminates in a unified image reconstruction framework. The study involves two stages. In the first stage, we design a MultiContentNet that learns multi-modal features (MMFs) of the latent sharp image. This network encodes visual degradation expressions and contextual text features into latent variables, thereby exerting a guided classification effect. Specifically, MultiContentNet is trained as an auxiliary controller that takes the degraded input image and, through contrastive learning, extracts MMFs of the latent target image, effectively yielding natural classifiers tailored to different degradation types. The second stage integrates the learned MMFs into an image restoration network via cross-attention, guiding the restoration model toward high-fidelity image recovery. Experiments on six blind image restoration tasks demonstrate that the proposed method achieves state-of-the-art performance, highlighting the potential of MMFs from large-scale pretrained vision-language models in advancing high-quality unified image reconstruction.
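To make the two-stage recipe concrete, the sketch below illustrates one plausible reading of it in PyTorch. It is an assumption-laden sketch, not the authors' implementation: the internals of MultiContentNet, the info_nce pairing of image embeddings with text embeddings (e.g. from a frozen vision-language model such as CLIP), and the CrossAttentionBlock are all illustrative names and choices.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class MultiContentNet(nn.Module):
        # Stage 1 (sketch): encode a degraded image into a multi-modal feature (MMF).
        # The real network's architecture is not specified in the abstract.
        def __init__(self, dim=256):
            super().__init__()
            self.backbone = nn.Sequential(
                nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten())
            self.proj = nn.Linear(128, dim)  # projection head for the contrastive loss

        def forward(self, x):  # x: (B, 3, H, W) degraded images
            return F.normalize(self.proj(self.backbone(x)), dim=-1)

    def info_nce(img_feats, text_feats, temperature=0.07):
        # Symmetric InfoNCE (hypothetical loss choice): pull each image's MMF toward
        # its paired text embedding, push it away from the other pairs in the batch.
        logits = img_feats @ text_feats.t() / temperature
        targets = torch.arange(img_feats.size(0), device=img_feats.device)
        return 0.5 * (F.cross_entropy(logits, targets) +
                      F.cross_entropy(logits.t(), targets))

    class CrossAttentionBlock(nn.Module):
        # Stage 2 (sketch): restoration features (queries) attend to the MMF prompt.
        def __init__(self, dim=256, heads=4):
            super().__init__()
            self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
            self.norm = nn.LayerNorm(dim)

        def forward(self, feats, mmf):
            # feats: (B, N, dim) flattened spatial tokens; mmf: (B, 1, dim) prompt token
            out, _ = self.attn(query=feats, key=mmf, value=mmf)
            return self.norm(feats + out)  # residual connection, then LayerNorm

Under this reading, the MMF acts as a per-image prompt: a CrossAttentionBlock can be inserted between stages of any encoder-decoder restoration backbone, so a single set of restoration weights can adapt its behavior to the degradation type inferred in the first stage.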
Pages: 11
Related Papers
50 records in total
  • [42] Wang, Wei; Yang, Xiaoyan; Ooi, Beng Chin; Zhang, Dongxiang; Zhuang, Yueting. Effective deep learning-based multi-modal retrieval. VLDB JOURNAL, 2016, 25(1): 79-101
  • [43] Kwon, Do Hyuck; Lee, Min Jun; Jeong, Heewon; Park, Sanghun; Cho, Kyung Hwa. Multi-modal learning-based algae phyla identification using image and particle modalities. WATER RESEARCH, 2025, 275
  • [44] Lu, Liucun; Qin, Jinghui; Jie, Zequn; Ma, Lin; Lin, Liang; Liang, Xiaodan. RecFormer: Recurrent Multi-modal Transformer with History-Aware Contrastive Learning for Visual Dialog. PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT I, 2024, 14425: 159-171
  • [45] Wang, Yusong; Li, Dongyuan; Funakoshi, Kotaro; Okumura, Manabu. EMP: Emotion-guided Multi-modal Fusion and Contrastive Learning for Personality Traits Recognition. PROCEEDINGS OF THE 2023 ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, ICMR 2023, 2023: 243-252
  • [46] Guo, Qinglang; Liao, Yong; Li, Zhe; Liang, Shenglin. Multi-Modal Representation via Contrastive Learning with Attention Bottleneck Fusion and Attentive Statistics Features. ENTROPY, 2023, 25(10)
  • [47] Li, Tongtong; Guo, Yuhui; Zhao, Ziyang; Chen, Miao; Lin, Qiang; Hu, Xiping; Yao, Zhijun; Hu, Bin. Automated Diagnosis of Major Depressive Disorder With Multi-Modal MRIs Based on Contrastive Learning: A Few-Shot Study. IEEE TRANSACTIONS ON NEURAL SYSTEMS AND REHABILITATION ENGINEERING, 2024, 32: 1566-1576
  • [48] Hamdy, Eman; Zaghloul, Mohamed Saad; Badawy, Osama. Deep learning supported breast cancer classification with multi-modal image fusion. 2021 22ND INTERNATIONAL ARAB CONFERENCE ON INFORMATION TECHNOLOGY (ACIT), 2021: 319-325
  • [49] Wei, Yiran; Chen, Xi; Zhu, Lei; Zhang, Lipei; Schonlieb, Carola-Bibiane; Price, Stephen; Li, Chao. Multi-Modal Learning for Predicting the Genotype of Glioma. IEEE TRANSACTIONS ON MEDICAL IMAGING, 2023, 42(11): 3167-3178
  • [50] Du, Lin; You, Xiong; Li, Ke; Meng, Liqiu; Cheng, Gong; Xiong, Liyang; Wang, Guangxia. Multi-modal deep learning for landform recognition. ISPRS JOURNAL OF PHOTOGRAMMETRY AND REMOTE SENSING, 2019, 158: 63-75