CVIformer: Cross-View Interactive Transformer for Efficient Stereoscopic Image Super-Resolution

Citations: 0

Authors:

Zhang, Dongyang [1]

Liang, Shuang [1]

He, Tao [1]

Shao, Jie [2]

Qin, Ke [1]

Affiliations:

[1] Univ Elect Sci & Technol China, Inst Intelligent Comp, Chengdu 611731, Peoples R China

[2] Univ Elect Sci & Technol China, Sch Comp Sci & Engn, Chengdu, Peoples R China

Source:

IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE | 2024

Funding:

National Natural Science Foundation of China;

Keywords:

Transformers; Stereo image processing; Feature extraction; Superresolution; Spatial resolution; Cameras; Task analysis; Stereoscopic image processing; lightweight transformer; super-resolution (SR); PARALLAX ATTENTION; NETWORK; MODULE;

DOI:

10.1109/TETCI.2024.3436904

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Inspired by the great success of the Transformer in computer vision, some works have started to explore the use of the Transformer for super-resolution (SR). However, for stereoscopic SR, which aims to recover details from input image pairs, how to efficiently integrate cross-view interactions into the Transformer architecture remains an open question. Additionally, most existing stereoscopic SR methods adopt a parallax mechanism only in the middle of the network, and the feature correlation between viewpoints inevitably weakens as the network depth increases. To address these issues, we first utilize an efficient residual transformer block (ERTB) as the backbone for long-range intra-view feature extraction. Subsequently, we propose a novel multi-Dconv cross attentive block (MCAB) to enhance the cross-view interactions at the rear part of the Transformer architecture. Notably, the proposed MCAB promotes feature fusion from two viewpoints by employing bidirectional cross-attention, as opposed to a unidirectional flow from left to right or vice versa. This approach results in an efficient cross-view interaction from both branches. By leveraging the advantages of the proposed ERTB and MCAB, we introduce an efficient cross-view interaction Transformer (CVIformer) for stereoscopic SR. This architecture is capable of incorporating long-range intra-view and cross-view information with an acceptable computational overhead. Extensive experiments conducted on four public datasets demonstrate that our model achieves state-of-the-art results without excessive complexity, using only 1.17 million parameters, with approximately a 40% reduction in parameters compared to leading methods like iPASSR.
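The bidirectional cross-attention idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' MCAB: it assumes plain scaled dot-product attention over short token lists, with no learned projections, no depth-wise convolutions, and a simple residual add. The function names (`cross_attend`, `bidirectional_cross_attention`) are hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(queries, keys, values):
    """Scaled dot-product attention: each query token attends over all keys."""
    d = len(queries[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out


def bidirectional_cross_attention(left, right):
    """Fuse two views in both directions (assumed residual fusion).

    left, right: lists of feature vectors (tokens) from the two views.
    Returns (left_fused, right_fused).
    """
    from_right = cross_attend(left, right, right)  # left queries the right view
    from_left = cross_attend(right, left, left)    # right queries the left view
    left_fused = [[a + b for a, b in zip(tok, upd)] for tok, upd in zip(left, from_right)]
    right_fused = [[a + b for a, b in zip(tok, upd)] for tok, upd in zip(right, from_left)]
    return left_fused, right_fused


if __name__ == "__main__":
    left_view = [[1.0, 0.0], [0.5, 0.5]]
    right_view = [[0.0, 1.0], [0.2, 0.8]]
    l_out, r_out = bidirectional_cross_attention(left_view, right_view)
    print(l_out)
    print(r_out)
```

Both branches receive information from the other view in the same step, which is the contrast with a one-way left-to-right (or right-to-left) flow that the abstract draws.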

Pages: 1107-1118

Page count: 12

50 items in total

[1] Steformer: Efficient Stereo Image Super-Resolution With Transformer
Lin, Jianxin
Yin, Lianying
Wang, Yijun
IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25 : 8396 - 8407
[2] CVGSR: Stereo image Super-Resolution with Cross-View guidance
Chen, Wenfei
Ni, Shijia
Shao, Feng
DISPLAYS, 2024, 83
[3] Super-Resolution Reconstruction for Stereoscopic Omnidirectional Display Systems via Dynamic Convolutions and Cross-View Transformer
Chai, Xiongli
Shao, Feng
Chen, Hangwei
Mu, Baoyang
Ho, Yo-Sung
IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2023, 72
[4] Interactformer: Interactive Transformer and CNN for Hyperspectral Image Super-Resolution
Liu, Yaoting
Hu, Jianwen
Kang, Xudong
Luo, Jing
Fan, Shaosheng
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2022, 60
[5] Cross View Capture for Stereo Image Super-Resolution
Zhu, Xiangyuan
Guo, Kehua
Fang, Hui
Chen, Liang
Ren, Sheng
Hu, Bin
IEEE TRANSACTIONS ON MULTIMEDIA, 2022, 24 : 3074 - 3086
[6] Recurrent Interaction Network for Stereoscopic Image Super-Resolution
Zhang, Zhe
Peng, Bo
Lei, Jianjun
Shen, Haifeng
Huang, Qingming
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2023, 33 (05) : 2048 - 2060
[7] Stereoscopic image super-resolution with interactive memory learning
Zhu, Xiangyuan
Guo, Kehua
Qiu, Tian
Fang, Hui
Wu, Zheng
Tan, Xuyang
Liu, Chao
EXPERT SYSTEMS WITH APPLICATIONS, 2023, 227
[8] Efficient Swin Transformer for Remote Sensing Image Super-Resolution
Kang, Xudong
Duan, Puhong
Li, Jier
Li, Shutao
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2024, 33 : 6367 - 6379
[9] Deep Stereoscopic Image Super-Resolution via Interaction Module
Lei, Jianjun
Zhang, Zhe
Fan, Xiaoting
Yang, Bolan
Li, Xinxin
Chen, Ying
Huang, Qingming
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2021, 31 (08) : 3051 - 3061
[10] Strong-Weak Cross-View Interaction Network for Stereo Image Super-Resolution
He, Kun
Li, Changyu
Shao, Jie
PROCEEDINGS OF THE 2023 ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, ICMR 2023, 2023, : 550 - 554
