TCCL-Net: Transformer-Convolution Collaborative Learning Network for Omnidirectional Image Super-Resolution

Cited by: 23
Authors
Chai, Xiongli [1 ]
Shao, Feng [1 ]
Jiang, Qiuping [1 ]
Ying, Hongwei [2 ]
Affiliations
[1] Ningbo Univ, Fac Informat Sci & Engn, Ningbo 315211, Peoples R China
[2] Ningbo Univ Technol, Coll Elect & Informat Engn, Ningbo 315211, Peoples R China
Funding
Natural Science Foundation of Zhejiang Province; National Natural Science Foundation of China;
Keywords
Omnidirectional images; Super Resolution; Swin Transformer; Collaborative learning network; Attention mechanisms;
DOI
10.1016/j.knosys.2023.110625
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
As virtual reality and the metaverse grow in popularity, the Omnidirectional Image (OI) has attracted considerable attention for its immersive display characteristics. However, users view only a portion of the content in a specific viewport extracted from the panoramic view, which leads to a resolution mismatch: clear near-eye display in a viewport requires High-Resolution (HR) content. Hence, it is necessary to develop a Super-Resolution (SR) solution for reconstructing Low-Resolution (LR) OIs. Unlike 2D SR methods, an Omnidirectional Image Super-Resolution (OISR) scheme must account for the variation of pixel distribution along latitudes. In this paper, we propose a novel end-to-end Transformer and Convolution Collaborative Learning Network (TCCL-Net) for OISR. First, Swin Transformer blocks and residual convolution blocks are employed to extract long-range and short-range dependencies, respectively, mining richer and more heterogeneous features from the two branches. Second, to better fuse these two feature streams, cross-guided enhanced attention mechanisms are designed for bidirectional information enhancement over both channel and spatial features. Third, to alleviate the nonuniform pixel distribution across latitudes, we add an absolute positional encoding to the Swin Transformer to represent patch weights at different positions, and we propose a tile-based panoramic reconstruction module that super-resolves bands with different pixel sampling characteristics across latitudes. Experimental results on two publicly available benchmark datasets demonstrate the superiority of the proposed approach over state-of-the-art methods on the OISR task. (c) 2023 Elsevier B.V. All rights reserved.
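The latitude-dependent sampling that motivates the tile-based reconstruction module can be made concrete with a small illustration. In an equirectangular OI, a pixel row at latitude θ covers a circle of circumference proportional to cos θ, so polar rows heavily oversample the sphere relative to equatorial rows. The following stdlib-only sketch (illustrative, not the authors' code; the function name and band layout are assumptions) splits an image into horizontal latitude bands and reports each band's mean cos(latitude), the quantity a latitude-adaptive scheme would use to treat bands differently:

```python
import math

def latitude_bands(height, n_bands):
    """Split an equirectangular image of `height` rows into `n_bands`
    horizontal bands and report each band's mean cos(latitude), a proxy
    for true horizontal sampling density on the sphere (pixels near the
    poles cover far less of the sphere than equatorial pixels)."""
    assert height % n_bands == 0, "height must divide evenly into bands"
    rows_per_band = height // n_bands
    bands = []
    for b in range(n_bands):
        r0, r1 = b * rows_per_band, (b + 1) * rows_per_band
        # Latitude at each row centre: +pi/2 at the top row, -pi/2 at the bottom.
        lats = [math.pi / 2 - (r + 0.5) / height * math.pi for r in range(r0, r1)]
        mean_density = sum(math.cos(t) for t in lats) / len(lats)
        bands.append((r0, r1, mean_density))
    return bands

# Polar bands show low mean cos(lat): they are oversampled in the
# equirectangular grid, so a latitude-adaptive SR scheme can spend
# less reconstruction effort there than on equatorial bands.
for r0, r1, d in latitude_bands(512, 8):
    print(f"rows {r0:3d}-{r1:3d}: mean cos(lat) = {d:.3f}")
```

The per-band density is symmetric about the equator and peaks at the central bands, which is exactly the nonuniformity the tile-based panoramic reconstruction module is designed to address.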
Pages: 15