CVIformer: Cross-View Interactive Transformer for Efficient Stereoscopic Image Super-Resolution

Cited by: 0
Authors
Zhang, Dongyang [1]
Liang, Shuang [1]
He, Tao [1]
Shao, Jie [2]
Qin, Ke [1]
Affiliations
[1] Univ Elect Sci & Technol China, Inst Intelligent Comp, Chengdu 611731, Peoples R China
[2] Univ Elect Sci & Technol China, Sch Comp Sci & Engn, Chengdu, Peoples R China
Source
IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE | 2024
Funding
National Natural Science Foundation of China;
Keywords
Transformers; Stereo image processing; Feature extraction; Superresolution; Spatial resolution; Cameras; Task analysis; Stereoscopic image processing; lightweight transformer; super-resolution (SR); PARALLAX ATTENTION; NETWORK; MODULE;
DOI
10.1109/TETCI.2024.3436904
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Subject classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Inspired by the success of the Transformer in computer vision, recent works have begun to explore Transformers for super-resolution (SR). For stereoscopic SR, however, which aims to recover details from a pair of input views, how to efficiently integrate cross-view interactions into the Transformer architecture remains an open problem. In addition, most existing stereoscopic SR methods apply a parallax mechanism only in the middle of the network, and the feature correlation between viewpoints inevitably weakens as network depth increases. To address these issues, we first use an efficient residual transformer block (ERTB) as the backbone for long-range intra-view feature extraction. We then propose a novel multi-Dconv cross attentive block (MCAB) to enhance cross-view interactions in the rear part of the Transformer architecture. Notably, the MCAB fuses features from the two viewpoints through bidirectional cross-attention rather than a unidirectional flow from left to right or vice versa, which yields efficient cross-view interaction in both branches. Building on the ERTB and MCAB, we introduce an efficient cross-view interaction Transformer (CVIformer) for stereoscopic SR. The architecture incorporates long-range intra-view and cross-view information with acceptable computational overhead. Extensive experiments on four public datasets demonstrate that our model achieves state-of-the-art results with only 1.17 million parameters, approximately 40% fewer than leading methods such as iPASSR.
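The bidirectional cross-attention idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's MCAB: it assumes plain scaled dot-product attention with identity projections (no learned weights, no depth-wise convolutions), and the names `cross_attend` and `bidirectional_cross_attention` are hypothetical. It only shows that each view queries the other, so information flows in both directions.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attend(queries, context):
    # Each query token gathers an attention-weighted mix of the context tokens.
    # Assumption: identity Q/K/V projections, single head.
    dim = len(queries[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in context]
        weights = softmax(scores)
        mixed = [sum(w * k[c] for w, k in zip(weights, context)) for c in range(dim)]
        out.append(mixed)
    return out

def bidirectional_cross_attention(left, right):
    # Left attends to right AND right attends to left, each with a residual add.
    from_right = cross_attend(left, right)
    from_left = cross_attend(right, left)
    left_out = [[x + y for x, y in zip(t, m)] for t, m in zip(left, from_right)]
    right_out = [[x + y for x, y in zip(t, m)] for t, m in zip(right, from_left)]
    return left_out, right_out

# Toy features: three tokens per view, two channels each (illustrative values).
left = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
right = [[0.9, 0.1], [0.2, 0.8], [0.4, 0.6]]
left_out, right_out = bidirectional_cross_attention(left, right)
print(len(left_out), len(right_out))
```

A unidirectional variant would compute only one of the two `cross_attend` calls, leaving the other view's features unchanged.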
Pages: 1107 / 1118
Number of pages: 12
Related papers
50 in total
  • [1] Steformer: Efficient Stereo Image Super-Resolution With Transformer
    Lin, Jianxin
    Yin, Lianying
    Wang, Yijun
    IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25 : 8396 - 8407
  • [2] CVGSR: Stereo image Super-Resolution with Cross-View guidance
    Chen, Wenfei
    Ni, Shijia
    Shao, Feng
    DISPLAYS, 2024, 83
  • [3] Super-Resolution Reconstruction for Stereoscopic Omnidirectional Display Systems via Dynamic Convolutions and Cross-View Transformer
    Chai, Xiongli
    Shao, Feng
    Chen, Hangwei
    Mu, Baoyang
    Ho, Yo-Sung
    IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2023, 72
  • [4] Interactformer: Interactive Transformer and CNN for Hyperspectral Image Super-Resolution
    Liu, Yaoting
    Hu, Jianwen
    Kang, Xudong
    Luo, Jing
    Fan, Shaosheng
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2022, 60
  • [5] Cross View Capture for Stereo Image Super-Resolution
    Zhu, Xiangyuan
    Guo, Kehua
    Fang, Hui
    Chen, Liang
    Ren, Sheng
    Hu, Bin
    IEEE TRANSACTIONS ON MULTIMEDIA, 2022, 24 : 3074 - 3086
  • [6] Recurrent Interaction Network for Stereoscopic Image Super-Resolution
    Zhang, Zhe
    Peng, Bo
    Lei, Jianjun
    Shen, Haifeng
    Huang, Qingming
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2023, 33 (05) : 2048 - 2060
  • [7] Stereoscopic image super-resolution with interactive memory learning
    Zhu, Xiangyuan
    Guo, Kehua
    Qiu, Tian
    Fang, Hui
    Wu, Zheng
    Tan, Xuyang
    Liu, Chao
    EXPERT SYSTEMS WITH APPLICATIONS, 2023, 227
  • [8] Efficient Swin Transformer for Remote Sensing Image Super-Resolution
    Kang, Xudong
    Duan, Puhong
    Li, Jier
    Li, Shutao
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2024, 33 : 6367 - 6379
  • [9] Deep Stereoscopic Image Super-Resolution via Interaction Module
    Lei, Jianjun
    Zhang, Zhe
    Fan, Xiaoting
    Yang, Bolan
    Li, Xinxin
    Chen, Ying
    Huang, Qingming
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2021, 31 (08) : 3051 - 3061
  • [10] Strong-Weak Cross-View Interaction Network for Stereo Image Super-Resolution
    He, Kun
    Li, Changyu
    Shao, Jie
    PROCEEDINGS OF THE 2023 ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, ICMR 2023, 2023, : 550 - 554