CVIformer: Cross-View Interactive Transformer for Efficient Stereoscopic Image Super-Resolution

Cited: 0
Authors
Zhang, Dongyang [1 ]
Liang, Shuang [1 ]
He, Tao [1 ]
Shao, Jie [2 ]
Qin, Ke [1 ]
Affiliations
[1] Univ Elect Sci & Technol China, Inst Intelligent Comp, Chengdu 611731, Peoples R China
[2] Univ Elect Sci & Technol China, Sch Comp Sci & Engn, Chengdu, Peoples R China
Source
IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE | 2025, Vol. 9, No. 2
Funding
National Natural Science Foundation of China
Keywords
Transformers; Stereo image processing; Feature extraction; Superresolution; Spatial resolution; Cameras; Task analysis; Stereoscopic image processing; lightweight transformer; super-resolution (SR); PARALLAX ATTENTION; NETWORK; MODULE;
DOI
10.1109/TETCI.2024.3436904
CLC Classification Number
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Inspired by the great success of the Transformer in computer vision, several works have begun to explore Transformers for super-resolution (SR). However, for stereoscopic SR, which aims to recover details from input image pairs, how to efficiently integrate cross-view interactions into the Transformer architecture remains an open problem. Moreover, most existing stereoscopic SR methods adopt a parallax mechanism only in the middle of the network, and the feature correlation between viewpoints inevitably weakens as network depth increases. To address these issues, we first employ an efficient residual Transformer block (ERTB) as the backbone for long-range intra-view feature extraction. We then propose a novel multi-Dconv cross-attentive block (MCAB) to strengthen cross-view interactions in the rear part of the Transformer architecture. Notably, the proposed MCAB fuses features from the two viewpoints through bidirectional cross-attention, rather than a unidirectional flow from left to right or vice versa, yielding efficient cross-view interaction across both branches. Leveraging the advantages of the proposed ERTB and MCAB, we introduce an efficient cross-view interaction Transformer (CVIformer) for stereoscopic SR, capable of incorporating long-range intra-view and cross-view information at an acceptable computational cost. Extensive experiments on four public datasets demonstrate that our model achieves state-of-the-art results with only 1.17 million parameters, roughly a 40% reduction compared with leading methods such as iPASSR.
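The bidirectional cross-attention described in the abstract can be illustrated with a minimal sketch. The code below is not the authors' MCAB (which also involves multi-Dconv projections); it only shows the core idea under simplifying assumptions: each view's features serve as queries attending over the other view's features, in both directions, with a residual connection. The function names (`cross_attention`, `bidirectional_cross_view`) and the 2-D token-by-channel feature layout are illustrative choices, not part of the paper.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax along the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, context, dim):
    # queries from one view attend over the other view's features;
    # projections (W_q, W_k, W_v) are omitted for brevity
    scores = queries @ context.T / np.sqrt(dim)   # (n, m) attention logits
    return softmax(scores, axis=-1) @ context     # weighted sum of context rows

def bidirectional_cross_view(left, right):
    # symmetric fusion: left attends to right AND right attends to left,
    # unlike a unidirectional left-to-right flow
    dim = left.shape[-1]
    left_out = left + cross_attention(left, right, dim)    # residual update
    right_out = right + cross_attention(right, left, dim)
    return left_out, right_out

rng = np.random.default_rng(0)
left = rng.standard_normal((5, 8))    # 5 tokens, 8 channels per view
right = rng.standard_normal((5, 8))
left_fused, right_fused = bidirectional_cross_view(left, right)
```

Because both directions are computed from the same pair of score matrices (transposes of each other up to scaling), this symmetric formulation adds cross-view information to both branches at little extra cost over a single unidirectional pass.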
Pages: 1107-1118 (12 pages)