TexPose: Neural Texture Learning for Self-Supervised 6D Object Pose Estimation

Cited by: 28

Authors:

Chen, Hanzhi [1]

Manhardt, Fabian [2]

Navab, Nassir [1]

Busam, Benjamin [1,3]

Affiliations:

[1] Tech Univ Munich, Munich, Germany

[2] Google Inc, Mountain View, CA USA

[3] 3Dwe ai, Haifa, Israel

Source:

2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR | 2023

DOI:

10.1109/CVPR52729.2023.00469

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

In this paper, we introduce neural texture learning for 6D object pose estimation from synthetic data and a few unlabelled real images. Our major contribution is a novel learning scheme which removes the drawbacks of previous works, namely the strong dependency on co-modalities or additional refinement. These have previously been necessary to provide training signals for convergence. We formulate such a scheme as two sub-optimisation problems on texture learning and pose learning. We separately learn to predict realistic texture of objects from real image collections and learn pose estimation from pixel-perfect synthetic data. Combining these two capabilities then allows us to synthesise photorealistic novel views to supervise the pose estimator with accurate geometry. To alleviate pose noise and segmentation imperfection present during the texture learning phase, we propose a surfel-based adversarial training loss together with texture regularisation from synthetic data. We demonstrate that the proposed approach significantly outperforms the recent state-of-the-art methods without ground-truth pose annotations and yields substantial generalisation improvements towards unseen scenes. Remarkably, our scheme improves the adopted pose estimators substantially even when initialised with much inferior performance.

Pages: 4841-4852

Page count: 12
