Unsupervised Learning of Depth Estimation and Camera Pose With Multi-Scale GANs

Cited by: 8
Authors
Xu, Yufan [1]
Wang, Yan [1,2]
Huang, Rui [1]
Lei, Zeyu [1]
Yang, Junyao [1]
Li, Zijian [1]
Affiliations
[1] Beihang Univ, Dept Automat Sci & Elect Engn, Beijing 100191, Peoples R China
[2] Beihang Univ, Beijing Adv Innovat Ctr Big Data Based Precis Med, Beijing 100191, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Cameras; Generative adversarial networks; Unsupervised learning; Task analysis; Pose estimation; Convolution; Learning systems; MSGAN; depth estimation; camera pose; coarse-to-fine; unsupervised learning;
DOI
10.1109/TITS.2021.3093592
Chinese Library Classification number
TU [Building Science]
Discipline classification code
0813
Abstract
Unsupervised learning methods have achieved remarkable performance in monocular depth estimation and camera pose estimation; most of them solve this multi-task learning problem by using the inherent geometric consistency between the two tasks as the self-supervision signal. However, most existing approaches adopt a generative model to predict the depth map, which leaves room for improvement in depth map resolution. To this end, we present an unsupervised learning architecture based on an adversarial learning model for high-resolution single-view depth and camera pose estimation. Specifically, we present a multi-scale deep convolutional Generative Adversarial Network (GAN) based learning system, which consists of three networks: a pose estimation network (PCNN), and Generator-D and Discriminator-D for depth map prediction. Furthermore, to generate high-resolution depth maps, we propose a multi-scale GAN model (MSGAN) that decomposes the hard high-quality image generation problem into more manageable sub-problems through a coarse-to-fine process. We then modify the overall generation architecture of the GAN model by changing the down-sampling and up-sampling components to improve the quality and accuracy of the depth map prediction. Finally, to improve the rate of convergence, we use the least-squares error to increase the penalty for outliers. Detailed quantitative and qualitative evaluations on the KITTI dataset show that the proposed method provides better results for both pose estimation and depth recovery.
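The abstract does not give the loss in closed form. As a minimal sketch, assuming the least-squares error is applied in the common least-squares GAN form (real samples pushed toward a score of 1, generated samples toward 0), the adversarial terms could look like the following. The function names, the 0/1 targets, and the plain summation over scales are illustrative assumptions, not the authors' implementation.

```python
from typing import Sequence


def mean(values: Sequence[float]) -> float:
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def ls_discriminator_loss(real_scores: Sequence[float],
                          fake_scores: Sequence[float]) -> float:
    """Least-squares discriminator loss (assumed targets: real=1, fake=0)."""
    real_term = mean([(s - 1.0) ** 2 for s in real_scores])
    fake_term = mean([s ** 2 for s in fake_scores])
    return 0.5 * (real_term + fake_term)


def ls_generator_loss(fake_scores: Sequence[float]) -> float:
    """Least-squares generator loss: generated samples should score 1."""
    return 0.5 * mean([(s - 1.0) ** 2 for s in fake_scores])


def multi_scale_generator_loss(
        fake_scores_per_scale: Sequence[Sequence[float]]) -> float:
    """Hypothetical coarse-to-fine aggregate: sum the loss over all scales."""
    return sum(ls_generator_loss(s) for s in fake_scores_per_scale)


if __name__ == "__main__":
    # The penalty grows quadratically with distance from the target score,
    # so a far-off score costs much more than a nearby one.
    for score in (1.0, 0.0, -1.0):
        print(score, ls_generator_loss([score]))
```

Because the penalty is quadratic in the distance from the target, scores far from the target contribute disproportionately to the loss, which is one way to read the abstract's remark about increasing the penalty for outliers.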
Pages: 17039-17047
Number of pages: 9