Human POSEitioning System (HPS): 3D Human Pose Estimation and Self-localization in Large Scenes from Body-Mounted Sensors

Cited by: 87

Authors:

Guzov, Vladimir [1,2]

Mir, Aymen [1,2]

Sattler, Torsten [3]

Pons-Moll, Gerard [1,2]

Affiliations:

[1] Univ Tubingen, Tubingen, Germany

[2] Max Planck Inst Informat, Saarbrucken, Germany

[3] Czech Tech Univ, CIIRC, Prague, Czech Republic

Source:

2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021 | 2021

Funding:

EU Horizon 2020;

Keywords:

PEOPLE;

DOI:

10.1109/CVPR46437.2021.00430

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We introduce the Human POSEitioning System (HPS), a method to recover the full 3D pose of a human registered with a 3D scan of the surrounding environment using wearable sensors. Using IMUs attached to the body limbs and a head-mounted camera looking outwards, HPS fuses camera-based self-localization with IMU-based human body tracking. The former provides drift-free but noisy position and orientation estimates, while the latter is accurate in the short term but subject to drift over longer periods of time. We show that our optimization-based integration exploits the benefits of the two, resulting in pose accuracy free of drift. Furthermore, we integrate 3D scene constraints into our optimization, such as foot contact with the ground, resulting in physically plausible motion. HPS complements more common third-person-based 3D pose estimation methods. It allows capturing larger recording volumes and longer periods of motion, and could be used for VR/AR applications where humans interact with the scene without requiring direct line of sight with an external camera, or to train agents that navigate and interact with the environment based on first-person visual input, like real humans. With HPS, we recorded a dataset of humans interacting with large 3D scenes (300-1000 m²) consisting of 7 subjects and more than 3 hours of diverse motion. The dataset, code and video will be available on the project page: http://virtualhumans.mpi-inf.mpg.de/hps.

Citation

Pages: 4316-4327

Page count: 12

References: 83 in total

