Two-Phase Approach for Monocular Object Detection and 6-DoF Pose Estimation

Cited by: 1

Authors:

Jang, Jae-hoon [1]

Lee, Jungyoon [2]

Kim, Seong-heum [1]

Affiliations:

[1] Soongsil Univ, Coll Informat Technol, Sch AI Convergence, Seoul, South Korea

[2] Soongsil Univ, Dept Intelligent Syst, Seoul, South Korea

Source:

JOURNAL OF ELECTRICAL ENGINEERING & TECHNOLOGY | 2024 / Vol. 19 / No. 03

Keywords:

Deep learning; Object detection; 6-DoF pose estimation; Perspective-n-point (PnP) algorithm

DOI:

10.1007/s42835-023-01640-7

Chinese Library Classification:

TM [Electrical engineering]; TN [Electronics and communication technology]

Subject classification codes:

0808; 0809

Abstract:

We present a two-phase algorithm that first identifies the categories and 2D proposal regions of 3D objects and then estimates the eight corners of cubes bounding the target objects. Given the predicted corners, the six-degrees-of-freedom (6-DoF) poses of the 3D objects are calculated using the conventional perspective-n-point (PnP) algorithm and evaluated with respect to manually annotated corners. In addition, several 3D models with high-quality shapes, texture information, 2D images, and annotations, such as 2D boxes, 3D cuboids, and segmentation masks, are collected. New objects are included while validating the proposed method. Our results are compared qualitatively and quantitatively with those of the baseline model using the publicly accessible LineMOD dataset, additional annotations in the OCCLUSION dataset, and our own custom dataset. While handling single and multiple objects in testing scenes, the proposed method is observed to exhibit clear improvements on both the aforementioned datasets and in real-world examples.
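
The abstract describes recovering a pose from eight predicted cuboid corners with PnP and scoring it against annotated corners. The sketch below illustrates only the forward model that PnP inverts: it projects the eight corners of a cuboid through a pinhole camera for a given pose and measures the mean pixel distance to a reference set of corners. It is not the authors' implementation and does not solve PnP itself. The helper names, the cuboid dimensions, and the camera intrinsics (fx, fy, cx, cy) are hypothetical values chosen for illustration.

```python
import math

def cuboid_corners(w, h, d):
    """Eight corners of an axis-aligned cuboid centred at the object origin."""
    return [(sx * w / 2, sy * h / 2, sz * d / 2)
            for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)]

def rotation_y(angle):
    """3x3 rotation matrix about the camera's y axis (angle in radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def project(points, R, t, fx, fy, cx, cy):
    """Pinhole projection of object-frame points under the pose (R, t)."""
    pixels = []
    for X in points:
        # Transform the point into the camera frame: Xc = R @ X + t.
        xc, yc, zc = (sum(R[i][j] * X[j] for j in range(3)) + t[i]
                      for i in range(3))
        pixels.append((fx * xc / zc + cx, fy * yc / zc + cy))
    return pixels

def mean_reprojection_error(predicted, reference):
    """Mean Euclidean pixel distance between matching 2D corners."""
    return sum(math.dist(p, q) for p, q in zip(predicted, reference)) / len(reference)

# Hypothetical setup: a 10 x 20 x 30 cm cuboid one metre in front of the camera.
corners = cuboid_corners(0.1, 0.2, 0.3)
intrinsics = dict(fx=600.0, fy=600.0, cx=320.0, cy=240.0)
annotated = project(corners, rotation_y(0.30), (0.0, 0.0, 1.0), **intrinsics)
candidate = project(corners, rotation_y(0.35), (0.0, 0.0, 1.0), **intrinsics)

# A pose that differs from the annotated one yields a non-zero pixel error.
print(mean_reprojection_error(candidate, annotated))
```

A PnP solver searches for the rotation and translation that drive this reprojection error towards zero for the predicted 2D corners, so the same error can be used to compare an estimated pose against manually annotated corners.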

Pages: 1817-1825

Page count: 9

