Self-Supervised Pretraining for Stereoscopic Image Super-Resolution With Parallax-Aware Masking

Citations: 6
Authors
Zhang, Zhe [1]
Lei, Jianjun [1]
Peng, Bo [1]
Zhu, Jie [1]
Huang, Qingming [2]
Affiliations
[1] Tianjin Univ, Sch Elect & Informat Engn, Tianjin 300072, Peoples R China
[2] Univ Chinese Acad Sci, Sch Comp Sci & Engn, Beijing 100190, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Self-supervised pretraining; super-resolution; stereoscopic image; RESOLUTION; ATTENTION;
DOI
10.1109/TBC.2024.3382960
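Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract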
Most existing learning-based methods for stereoscopic image super-resolution rely on a great number of high-resolution stereoscopic images as labels. To alleviate the problem of data dependency, this paper proposes a self-supervised pretraining-based method for stereoscopic image super-resolution (SelfSSR). Specifically, to develop a self-supervised pretext task for stereoscopic images, a parallax-aware masking strategy (PAMS) is designed to adaptively mask matching areas of the left and right views. With PAMS, the network is encouraged to effectively predict missing information of input images. Besides, a cross-view Transformer module (CVTM) is presented to aggregate the intra-view and inter-view information simultaneously for stereoscopic image reconstruction. Meanwhile, the cross-attention map learned by CVTM is utilized to guide the masking process in PAMS. Comparative results on four datasets show that the proposed SelfSSR achieves state-of-the-art performance by using only 10% of labeled training data.
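The abstract says PAMS masks matching areas of the left and right views under the guidance of a cross-attention map. The sketch below is one possible reading of that idea, not the authors' implementation. It assumes a row-wise attention tensor `attn` of shape (H, W, W), where `attn[h, i, j]` scores left pixel (h, i) against right pixel (h, j). It also assumes that masked left pixels are drawn at random and that the match is taken as the argmax. The function name `parallax_aware_mask` and the parameters `mask_ratio` and `seed` are hypothetical.

```python
import numpy as np

def parallax_aware_mask(attn, mask_ratio=0.5, seed=0):
    """Return boolean masks (left, right) that cover matching pixels.

    attn: array of shape (H, W, W); attn[h, i, j] is the assumed
    cross-attention score between left pixel (h, i) and right pixel (h, j).
    """
    rng = np.random.default_rng(seed)
    H, W, _ = attn.shape

    # Assumption: left pixels to mask are chosen uniformly at random.
    left_mask = rng.random((H, W)) < mask_ratio

    # For each left pixel, take the right-view column with the highest score.
    match = attn.argmax(axis=-1)  # shape (H, W)

    # Mask the matched right pixel of every masked left pixel.
    right_mask = np.zeros((H, W), dtype=bool)
    rows, cols = np.nonzero(left_mask)
    right_mask[rows, match[rows, cols]] = True
    return left_mask, right_mask
```

Masking both members of a matched pair keeps the network from simply copying the missing content from the other view, which is presumably why the masking is made parallax-aware.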
Pages: 482-491
Number of pages: 10