Advancing Real-World Stereoscopic Image Super-Resolution via Vision-Language Model

Cited by: 0
Authors
Zhang, Zhe [1, 2]
Lei, Jianjun [1]
Peng, Bo [1]
Zhu, Jie [1]
Xu, Liying [1]
Huang, Qingming [3]
Affiliations
[1] Tianjin Univ, Sch Elect & Informat Engn, Tianjin 300072, Peoples R China
[2] Tianjin Univ Commerce, Sch Informat Engn, Tianjin 300134, Peoples R China
[3] Univ Chinese Acad Sci, Sch Comp Sci & Technol, Beijing 100190, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Stereo image processing; Degradation; Superresolution; Visualization; Image reconstruction; Training; Iterative methods; Solid modeling; Computational modeling; Cognition; Super-resolution; stereoscopic image; vision-language model;
DOI
10.1109/TIP.2025.3546470
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Recent years have witnessed the remarkable success of the vision-language model in various computer vision tasks. However, how to exploit the semantic language knowledge of the vision-language model to advance real-world stereoscopic image super-resolution remains a challenging problem. This paper proposes a vision-language model-based stereoscopic image super-resolution (VLM-SSR) method, in which the semantic language knowledge in CLIP is exploited to facilitate stereoscopic image SR in a training-free manner. Specifically, by designing visual prompts for CLIP to infer the region similarity, a prompt-guided information aggregation mechanism is presented to capture inter-view information among relevant regions between the left and right views. Besides, driven by the prior knowledge of CLIP, a cognition prior-driven iterative enhancing mechanism is presented to optimize fuzzy regions adaptively. Experimental results on four datasets verify the effectiveness of the proposed method.
Pages: 2187-2197
Page count: 11