SPDET: Edge-Aware Self-Supervised Panoramic Depth Estimation Transformer With Spherical Geometry

Cited by: 7

Authors:

Zhuang, Chuanqing [1]

Lu, Zhengda [1]

Wang, Yiqun [2]

Xiao, Jun [1]

Wang, Ying [1]

Affiliations:

[1] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100049, Peoples R China

[2] Chongqing Univ, Coll Comp Sci, Chongqing 400044, Peoples R China

Source:

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE | 2023 / Vol. 45 / No. 10

Funding:

China Postdoctoral Science Foundation; National Natural Science Foundation of China

Keywords:

Edge-aware; monocular depth estimation; panoramic camera; pre-filtered depth-image-based rendering; self-supervision; spherical geometry

DOI:

10.1109/TPAMI.2023.3272949

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

Panoramic depth estimation has become a hot topic in 3D reconstruction because of its omnidirectional field of view. However, panoramic RGB-D datasets are difficult to obtain due to the lack of panoramic RGB-D cameras, which limits the practicality of supervised panoramic depth estimation. Self-supervised learning based on RGB stereo image pairs has the potential to overcome this limitation due to its low dependence on datasets. In this work, we propose SPDET, an edge-aware self-supervised panoramic depth estimation network that combines the transformer with a spherical geometry feature. Specifically, we first introduce the panoramic geometry feature to construct our panoramic transformer and reconstruct high-quality depth maps. Furthermore, we introduce the pre-filtered depth-image-based rendering method to synthesize the novel view image for self-supervision. Meanwhile, we design an edge-aware loss function to improve self-supervised depth estimation for panoramic images. Finally, we demonstrate the effectiveness of SPDET with a series of comparison and ablation experiments, achieving state-of-the-art performance in self-supervised monocular panoramic depth estimation. Our code and models are available at https://github.com/zcq15/SPDET.


Pages: 12474-12489

Page count: 16

