Advancing Real-World Stereoscopic Image Super-Resolution via Vision-Language Model

Times Cited: 0
Authors
Zhang, Zhe [1 ,2 ]
Lei, Jianjun [1 ]
Peng, Bo [1 ]
Zhu, Jie [1 ]
Xu, Liying [1 ]
Huang, Qingming [3 ]
Affiliations
[1] Tianjin Univ, Sch Elect & Informat Engn, Tianjin 300072, Peoples R China
[2] Tianjin Univ Commerce, Sch Informat Engn, Tianjin 300134, Peoples R China
[3] Univ Chinese Acad Sci, Sch Comp Sci & Technol, Beijing 100190, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Stereo image processing; Degradation; Superresolution; Visualization; Image reconstruction; Training; Iterative methods; Solid modeling; Computational modeling; Cognition; Super-resolution; stereoscopic image; vision-language model;
DOI
10.1109/TIP.2025.3546470
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recent years have witnessed the remarkable success of vision-language models in various computer vision tasks. However, how to exploit the semantic language knowledge of vision-language models to advance real-world stereoscopic image super-resolution remains a challenging problem. This paper proposes a vision-language model-based stereoscopic image super-resolution (VLM-SSR) method, in which the semantic language knowledge in CLIP is exploited to facilitate stereoscopic image SR in a training-free manner. Specifically, by designing visual prompts for CLIP to infer region similarity, a prompt-guided information aggregation mechanism is presented to capture inter-view information among relevant regions of the left and right views. In addition, driven by the prior knowledge of CLIP, a cognition prior-driven iterative enhancing mechanism is presented to adaptively refine fuzzy regions. Experimental results on four datasets verify the effectiveness of the proposed method.
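The abstract describes aggregating inter-view information between left- and right-view regions according to similarity inferred from CLIP. The paper's actual mechanism is not reproduced here; the following is only a minimal conceptual sketch of similarity-weighted cross-view aggregation, with random vectors standing in for CLIP region embeddings (the function names `cosine_sim` and `aggregate_cross_view` are illustrative, not from the paper).

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity between row vectors: a (n, d), b (m, d) -> (n, m)."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

def aggregate_cross_view(left_feats, right_feats, temperature=0.1):
    """For each left-view region, softly aggregate right-view region
    features, weighted by embedding similarity. A stand-in for the
    prompt-guided inter-view aggregation sketched in the abstract."""
    sim = cosine_sim(left_feats, right_feats)     # (n, m) similarity matrix
    w = np.exp(sim / temperature)
    w = w / w.sum(axis=1, keepdims=True)          # softmax weights over right regions
    return w @ right_feats                        # (n, d) aggregated features

# Toy stand-ins for CLIP embeddings of cropped regions from each view.
rng = np.random.default_rng(0)
left = rng.standard_normal((4, 8))    # 4 left-view region embeddings
right = rng.standard_normal((6, 8))   # 6 right-view region embeddings
agg = aggregate_cross_view(left, right)
print(agg.shape)  # (4, 8)
```

In a real pipeline, `left` and `right` would come from a CLIP image encoder applied to region crops, and the similarity would guide which right-view regions contribute detail to each left-view region.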
Pages: 2187-2197
Page count: 11
Related Papers
50 records total
  • [21] Single Image Super-Resolution Quality Assessment: A Real-World Dataset, Subjective Studies, and an Objective Metric
    Jiang, Qiuping
    Liu, Zhentao
    Gu, Ke
    Shao, Feng
    Zhang, Xinfeng
    Liu, Hantao
    Lin, Weisi
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2022, 31 : 2279 - 2294
  • [22] Learning to Zoom-In via Learning to Zoom-Out: Real-World Super-Resolution by Generating and Adapting Degradation
    Sun, Wei
    Gong, Dong
    Shi, Qinfeng
    van den Hengel, Anton
    Zhang, Yanning
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30 : 2947 - 2962
  • [23] Frequency-Aware Degradation Modeling for Real-World Thermal Image Super-Resolution
    Qu, Chao
    Chen, Xiaoyu
    Xu, Qihan
    Han, Jing
    ENTROPY, 2024, 26 (03)
  • [24] Direct Unsupervised Super-Resolution Using Generative Adversarial Network (DUS-GAN) for Real-World Data
    Prajapati, Kalpesh
    Chudasama, Vishal
    Patel, Heena
    Upla, Kishor
    Raja, Kiran
    Ramachandra, Raghavendra
    Busch, Christoph
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30 : 8251 - 8264
  • [25] Real-World Person Re-Identification via Super-Resolution and Semi-Supervised Methods
    Xia, Limin
    Zhu, Jiahui
    Yu, Zhimin
    IEEE ACCESS, 2021, 9 : 35834 - 35845
  • [26] Learning the Frequency Domain Aliasing for Real-World Super-Resolution
    Hao, Yukun
    Yu, Feihong
    ELECTRONICS, 2024, 13 (02)
  • [27] Real-World super-resolution under the guidance of optimal transport
    Li, Zezeng
    Lei, Na
    Shi, Ji
    Xue, Hao
    MACHINE VISION AND APPLICATIONS, 2022, 33 (03)
  • [29] AdaDiffSR: Adaptive Region-Aware Dynamic Acceleration Diffusion Model for Real-World Image Super-Resolution
    Fan, Yuanting
    Liu, Chengxu
    Yin, Nengzhong
    Gao, Changlong
    Qian, Xueming
    COMPUTER VISION - ECCV 2024, PT XII, 2025, 15070 : 396 - 413
  • [30] Unsupervised Image Super-Resolution for High-Resolution Satellite Imagery via Omnidirectional Real-to-Synthetic Domain Translation
    Chung, Minkyung
    Kim, Yongil
    IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, 2025, 18 : 4427 - 4445