Unsupervised Learning of Depth Estimation and Camera Pose With Multi-Scale GANs

Cited by: 8

Authors:

Xu, Yufan [1]

Wang, Yan [1,2]

Huang, Rui [1]

Lei, Zeyu [1]

Yang, Junyao [1]

Li, Zijian [1]

Affiliations:

[1] Beihang Univ, Dept Automat Sci & Elect Engn, Beijing 100191, Peoples R China

[2] Beihang Univ, Beijing Adv Innovat Ctr Big Data Based Precis Med, Beijing 100191, Peoples R China

Source:

IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS | 2022 / Vol. 23 / Issue 10

Funding:

National Natural Science Foundation of China;

Keywords:

Cameras; Generative adversarial networks; Unsupervised learning; Task analysis; Pose estimation; Convolution; Learning systems; MSGAN; depth estimation; camera pose; coarse-to-fine; unsupervised learning;

DOI:

10.1109/TITS.2021.3093592

Chinese Library Classification (CLC) number:

TU [Building Science];

Discipline classification code:

0813 ;

Abstract:

Unsupervised learning methods have achieved remarkable performance in monocular depth estimation and camera pose estimation; most solve this multi-task learning problem by using the inherent geometric consistency between the two tasks as the self-supervision signal. However, most existing approaches adopt a generative model to predict the depth map, which leaves room for improvement in depth-map resolution. To this end, we present an unsupervised learning architecture based on an adversarial learning model for high-resolution single-view depth and camera pose estimation. Specifically, we present a multi-scale deep convolutional Generative Adversarial Network (GAN) based learning system, which consists of three networks: a pose estimation network (PCNN), and Generator-D and Discriminator-D for depth map prediction. Furthermore, to generate high-resolution depth maps, we propose a multi-scale GAN model (MSGAN) that decomposes the hard problem of high-quality image generation into more manageable sub-problems through a coarse-to-fine process. We then modify the overall generation architecture of the GAN model by changing the down-sampling and up-sampling components to improve the quality and accuracy of the depth map prediction. Finally, to improve the rate of convergence, we use the least-squares error to increase the penalty for outliers. Detailed quantitative and qualitative evaluations of the proposed framework on the KITTI dataset show that the proposed method provides better results for both pose estimation and depth recovery.
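
The least-squares idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard least-squares GAN formulation (target 1 for real samples, 0 for generated ones); the abstract does not state the paper's exact loss, so the function names and label convention below are assumptions, not the authors' implementation.

```python
def lsgan_discriminator_loss(real_scores, fake_scores):
    """Least-squares discriminator loss (assumed targets: real -> 1, fake -> 0)."""
    real_term = sum((s - 1.0) ** 2 for s in real_scores) / len(real_scores)
    fake_term = sum(s ** 2 for s in fake_scores) / len(fake_scores)
    return 0.5 * (real_term + fake_term)


def lsgan_generator_loss(fake_scores):
    """Least-squares generator loss: push scores of generated samples toward 1."""
    return 0.5 * sum((s - 1.0) ** 2 for s in fake_scores) / len(fake_scores)


# The penalty grows quadratically with the distance from the target score,
# so a generated sample scored far from 1 (an outlier) costs far more than
# one scored close to 1.
near = lsgan_generator_loss([0.9])   # 0.5 * (0.9 - 1)^2
far = lsgan_generator_loss([-1.0])   # 0.5 * (-1 - 1)^2
```

In this sketch the outlier's loss is several hundred times larger than the near-target sample's, which is the sense in which a squared error "increases the penalty for outliers".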

Pages: 17039-17047

Number of pages: 9
