PACE: Human and Camera Motion Estimation from in-the-wild Videos

Times cited: 1
Authors
Kocabas, Muhammed [1, 2, 3]
Yuan, Ye [1]
Molchanov, Pavlo [1]
Guo, Yunrong [1]
Black, Michael J. [2]
Hilliges, Otmar [3]
Kautz, Jan [1]
Iqbal, Umar [1]
Affiliations
[1] NVIDIA, Santa Clara, CA 95051 USA
[2] Max Planck Inst Intelligent Syst, Tubingen, Germany
[3] Swiss Fed Inst Technol, Zurich, Switzerland
Source
2024 INTERNATIONAL CONFERENCE ON 3D VISION, 3DV 2024 | 2024
Keywords
CAPTURE
DOI
10.1109/3DV62453.2024.00103
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We present a method to estimate human motion in a global scene from moving cameras. This is a highly challenging task due to the coupling of human and camera motions in the video. To address this problem, we propose a joint optimization framework that disentangles human and camera motions using both foreground human motion priors and background scene features. Unlike existing methods that use SLAM as initialization, we propose to tightly integrate SLAM and human motion priors in an optimization that is inspired by bundle adjustment. Specifically, we optimize human and camera motions to match both the observed human pose and scene features. This design combines the strengths of SLAM and motion priors, which leads to significant improvements in human and camera motion estimation. We additionally introduce a motion prior that is suitable for batch optimization, making our approach significantly more efficient than existing approaches. Finally, we propose a novel synthetic dataset that enables evaluating camera motion in addition to human motion from dynamic videos. Experiments on the synthetic and real-world RICH datasets demonstrate that our approach substantially outperforms prior art in recovering both human and camera motions.
Pages: 397-408
Page count: 12