PACE: Human and Camera Motion Estimation from in-the-wild Videos

Cited by: 1
Authors
Kocabas, Muhammed [1, 2, 3]
Yuan, Ye [1 ]
Molchanov, Pavlo [1 ]
Guo, Yunrong [1 ]
Black, Michael J. [2 ]
Hilliges, Otmar [3 ]
Kautz, Jan [1 ]
Iqbal, Umar [1 ]
Affiliations
[1] NVIDIA, Santa Clara, CA 95051 USA
[2] Max Planck Inst Intelligent Syst, Tubingen, Germany
[3] Swiss Fed Inst Technol, Zurich, Switzerland
Source
2024 INTERNATIONAL CONFERENCE ON 3D VISION, 3DV 2024 | 2024
Keywords
CAPTURE
DOI
10.1109/3DV62453.2024.00103
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
We present a method to estimate human motion in a global scene from moving cameras. This is a highly challenging task due to the coupling of human and camera motions in the video. To address this problem, we propose a joint optimization framework that disentangles human and camera motions using both foreground human motion priors and background scene features. Unlike existing methods that use SLAM as initialization, we propose to tightly integrate SLAM and human motion priors in an optimization that is inspired by bundle adjustment. Specifically, we optimize human and camera motions to match both the observed human pose and scene features. This design combines the strengths of SLAM and motion priors, which leads to significant improvements in human and camera motion estimation. We additionally introduce a motion prior that is suitable for batch optimization, making our approach significantly more efficient than existing approaches. Finally, we propose a novel synthetic dataset that enables evaluating camera motion in addition to human motion from dynamic videos. Experiments on the synthetic and real-world RICH datasets demonstrate that our approach substantially outperforms prior art in recovering both human and camera motions.
Pages: 397-408
Page count: 12