Deep Learning on Monocular Object Pose Detection and Tracking: A Comprehensive Overview

Cited by: 37
Authors
Fan, Zhaoxin [1]
Zhu, Yazhi [2]
He, Yulin [1]
Sun, Qi [1]
Liu, Hongyan [3]
He, Jun [1]
Affiliations
[1] Renmin Univ China, Sch Informat, Key Lab Data Engn & Knowledge Engn MOE, 59 Zhongguancun St, Beijing 100872, Peoples R China
[2] Beijing Jiaotong Univ, Inst Informat Sci, 3 Shangyuancun, Beijing, Peoples R China
[3] Tsinghua Univ, Sch Econ & Management, Beijing 100084, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Object pose detection; object pose tracking; instance-level; category-level; monocular; augmented reality
DOI
10.1145/3524496
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Object pose detection and tracking have recently attracted increasing attention because of their wide application in areas such as autonomous driving, robotics, and augmented reality. Among methods for object pose detection and tracking, deep learning is the most promising and has shown better performance than the alternatives. However, a survey of the latest developments in deep learning-based methods is lacking. This study therefore presents a comprehensive review of recent progress in deep learning-based object pose detection and tracking. To allow a more thorough treatment, the scope of this study is limited to methods that take monocular RGB/RGBD data as input, covering three major tasks: instance-level monocular object pose detection, category-level monocular object pose detection, and monocular object pose tracking. Metrics, datasets, and methods for both detection and tracking are presented in detail. Comparative results of current state-of-the-art methods on several publicly available datasets are also presented, together with insightful observations and future research directions.
Pages: 40