Parallax Attention for Unsupervised Stereo Correspondence Learning

Cited by: 99
Authors
Wang, Longguang [1]
Guo, Yulan [1,2]
Wang, Yingqian [1]
Liang, Zhengfa [3]
Lin, Zaiping [1]
Yang, Jungang [1]
An, Wei [1]
Affiliations
[1] Nat Univ Def Technol NUDT, Coll Elect Sci & Technol, Changsha 410073, Peoples R China
[2] Sun Yat Sen Univ, Sch Elect & Commun Engn, Guangzhou 510275, Peoples R China
[3] Natl Key Lab Sci & Technol Blind Signal Proc, Chengdu 610041, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Task analysis; Three-dimensional displays; Cameras; Correlation; Aggregates; Parallax attention; stereo matching; image super-resolution; unsupervised learning; stereo correspondence; NETWORKS;
DOI
10.1109/TPAMI.2020.3026899
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Stereo image pairs encode 3D scene cues into stereo correspondences between the left and right images. To exploit 3D cues within stereo images, recent CNN-based methods commonly use cost volume techniques to capture stereo correspondence over large disparities. However, since disparities can vary significantly for stereo cameras with different baselines, focal lengths and resolutions, the fixed maximum disparity used in cost volume techniques prevents them from handling different stereo image pairs with large disparity variations. In this paper, we propose a generic parallax-attention mechanism (PAM) to capture stereo correspondence regardless of disparity variations. Our PAM integrates epipolar constraints with an attention mechanism, calculating feature similarities along the epipolar line to capture stereo correspondence. Based on our PAM, we propose a parallax-attention stereo matching network (PASMnet) and a parallax-attention stereo image super-resolution network (PASSRnet) for stereo matching and stereo image super-resolution tasks. Moreover, we introduce a new and large-scale dataset named Flickr1024 for stereo image super-resolution. Experimental results show that our PAM is generic and can effectively learn stereo correspondence under large disparity variations in an unsupervised manner. Comparative results show that our PASMnet and PASSRnet achieve state-of-the-art performance.
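The idea of computing feature similarities along the epipolar line can be illustrated with a minimal sketch. It assumes rectified stereo pairs (so the epipolar line of a pixel is its own image row) and uses a plain dot product followed by a softmax as the similarity measure. The function names and the nested-list feature layout are illustrative assumptions, not the authors' PASMnet or PASSRnet code.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def parallax_attention(left, right):
    """Row-wise attention between two rectified stereo feature maps.

    left, right: nested lists of shape [H][W][C] (hypothetical layout).
    Returns attn with attn[h][i][j] = weight that left pixel (h, i)
    assigns to right pixel (h, j). Only pixels on the same row are
    compared, which is the epipolar constraint for rectified pairs.
    """
    attn = []
    for row_left, row_right in zip(left, right):
        row_attn = []
        for feat_left in row_left:
            # Dot-product similarity against every position on the same row.
            scores = [
                sum(a * b for a, b in zip(feat_left, feat_right))
                for feat_right in row_right
            ]
            row_attn.append(softmax(scores))
        attn.append(row_attn)
    return attn


def best_match(weights):
    """Column index in the right view with the highest attention weight."""
    return max(range(len(weights)), key=lambda j: weights[j])
```

Because the attention spans the full row width, no maximum disparity has to be fixed in advance. In this sketch, a left pixel with no counterpart in the right view simply receives a flat weight distribution.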
Pages: 2108-2125
Page count: 18