Unsupervised Light Field Depth Estimation via Multi-View Feature Matching With Occlusion Prediction

Cited by: 8

Authors:

Zhang, Shansi [1]

Meng, Nan [2]

Lam, Edmund Y. [1]

Affiliations:

[1] Univ Hong Kong, Dept Elect & Elect Engn, Hong Kong, Peoples R China

[2] Univ Hong Kong, Li Ka Shing Fac Med, Hong Kong, Peoples R China

Source:

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY | 2024 / Vol. 34 / No. 04

Keywords:

Estimation; Costs; Training; Image edge detection; Feature extraction; Convolutional neural networks; Training data; Light field; unsupervised depth estimation; feature matching; occlusion prediction; DISPARITY ESTIMATION;

DOI:

10.1109/TCSVT.2023.3305978

Chinese Library Classification (CLC):

TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

Depth estimation from light field (LF) images is a fundamental step for numerous applications. Recently, learning-based methods have achieved higher accuracy and efficiency than traditional methods. However, it is costly to obtain sufficient depth labels for supervised training. In this paper, we propose an unsupervised framework to estimate depth from LF images. First, we design a disparity estimation network (DispNet) with a coarse-to-fine structure to predict disparity maps from different view combinations. It explicitly performs multi-view feature matching to learn the correspondences effectively. Because occlusions may violate photo-consistency, we introduce an occlusion prediction network (OccNet) to predict occlusion maps, which are used as element-wise weights of the photometric loss to address the occlusion issue and assist the disparity learning. With the disparity maps estimated from multiple input combinations, we then propose a disparity fusion strategy based on the estimated errors, with effective occlusion handling, to obtain a final disparity map with higher accuracy. Experimental results demonstrate that our method achieves superior performance on both dense and sparse LF images, and also shows better robustness and generalization on real-world LF images compared to other methods.
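The two mechanisms named in the abstract, occlusion maps used as element-wise weights of the photometric loss and error-based disparity fusion, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the normalization by the weight sum, and the per-pixel minimum-error selection rule are all assumptions made here for illustration, since the abstract does not specify these details.

```python
import numpy as np


def occlusion_weighted_photometric_loss(reference, warped, weights, eps=1e-8):
    """Weighted mean absolute error between a reference view and a warped view.

    `weights` plays the role of an occlusion map: values near 0 suppress
    pixels believed to be occluded. Normalizing by the weight sum is an
    assumption of this sketch, not a detail taken from the paper.
    """
    error = np.abs(reference - warped)
    return float((weights * error).sum() / (weights.sum() + eps))


def fuse_disparities(disparities, errors):
    """Per pixel, keep the candidate disparity with the smallest estimated error.

    `disparities` and `errors` have shape (num_candidates, H, W).
    Minimum-error selection is a hypothetical stand-in for the paper's
    fusion strategy.
    """
    disparities = np.asarray(disparities, dtype=float)
    errors = np.asarray(errors, dtype=float)
    best = np.argmin(errors, axis=0)  # (H, W) index of the best candidate
    return np.take_along_axis(disparities, best[None, ...], axis=0)[0]


# Toy example: one pixel violates photo-consistency.
reference = np.zeros((2, 2))
warped = np.array([[0.0, 0.0], [0.0, 1.0]])
uniform = np.ones((2, 2))
masked = np.array([[1.0, 1.0], [1.0, 0.0]])  # down-weight the bad pixel

loss_uniform = occlusion_weighted_photometric_loss(reference, warped, uniform)
loss_masked = occlusion_weighted_photometric_loss(reference, warped, masked)

# Two candidate disparity maps with per-pixel error estimates.
candidates = np.stack([np.full((2, 2), 1.0), np.full((2, 2), 2.0)])
errors = np.stack([
    np.array([[0.1, 0.9], [0.1, 0.9]]),
    np.array([[0.5, 0.2], [0.5, 0.2]]),
])
fused = fuse_disparities(candidates, errors)
```

In the toy example, masking the inconsistent pixel removes its contribution to the loss, and the fused map takes each pixel from whichever candidate reports the lower error.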


Pages: 2261-2273

Page count: 13
