Two-Stream Based Multi-Stage Hybrid Decoder for Self-Supervised Multi-Frame Monocular Depth

Cited by: 7
Authors
Long, Yangqi [1]
Yu, Huimin [1, 2, 3, 4]
Liu, Biyang [1]
Affiliations
[1] Zhejiang Univ, Coll Informat Sci & Elect Engn, Hangzhou 310027, Peoples R China
[2] Zhejiang Univ, ZJU League Res & Dev Ctr, Hangzhou 310027, Zhejiang, Peoples R China
[3] Zhejiang Univ, State Key Lab CAD&CG, Hangzhou 310027, Zhejiang, Peoples R China
[4] Zhejiang Prov Key Lab Informat Proc Commun & Netw, Hangzhou, Peoples R China
Keywords
Deep learning for visual perception; deep learning methods; visual learning
DOI
10.1109/LRA.2022.3214787
Chinese Library Classification
TP24 [Robotics]
Discipline classification codes
080202; 1405
Abstract
Self-supervised depth estimation has attracted considerable attention recently because of its low cost. Although they are trained with self-supervision from image sequences, current single-image methods infer depth from scene information alone and ignore matching information, which is also important. Matching information, however, is not always reliable, especially in textureless and occluded regions. It is therefore attractive to combine the strengths of single-image scene information and multi-frame matching information. In this letter, we propose a two-stream multi-stage hybrid decoder to carry out this integration effectively. The hybrid decoder consists of two pathways, one for each kind of information, and fuses them interactively. Specifically, a cost volume representing the matching information is built from the scene prior and fed back to the single-image pathway to complete the integration. To further facilitate the interactive integration, a multi-stage fusion strategy is embedded seamlessly into the hybrid decoder, yielding more accurate depth estimates. Our approach outperforms existing self-supervised methods on the KITTI and Cityscapes datasets.
Pages: 12291-12298
Number of pages: 8