Enhancing egocentric 3D pose estimation with third person views

Cited by: 4
Authors
Dhamanaskar, Ameya [1]
Dimiccoli, Mariella [1]
Corona, Enric [1]
Pumarola, Albert [1]
Moreno-Noguer, Francesc [1]
Affiliations
[1] UPC, CSIC, Inst Robot & Informat Ind, Carrer Llorens & Artigas 4-6, Barcelona 08028, Spain
Keywords
3D pose estimation; Self-supervised learning; Egocentric vision
DOI
10.1016/j.patcog.2023.109358
CLC number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We propose a novel approach to enhance the 3D body pose estimation of a person computed from videos captured from a single wearable camera. The main technical contribution consists of leveraging high-level features linking first- and third-person views in a joint embedding space. To learn this embedding space we introduce First2Third-Pose, a new paired synchronized dataset of nearly 2000 videos depicting human activities captured from both first- and third-person perspectives. We explicitly consider spatial- and motion-domain features, combined using a semi-Siamese architecture trained in a self-supervised fashion. Experimental results demonstrate that the joint multi-view embedding space learned with our dataset is useful for extracting discriminative features from arbitrary single-view egocentric videos, with no need for any domain adaptation or knowledge of camera parameters. An extensive evaluation demonstrates a significant improvement in egocentric 3D body pose estimation performance on two unconstrained datasets, over three supervised state-of-the-art approaches. The collected dataset and pre-trained model are available for research purposes.
(c) 2023 The Author(s). Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
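The abstract describes a semi-Siamese network trained in a self-supervised fashion to align synchronized first- and third-person clip features in a joint embedding space. The paper's actual code is not reproduced in this record; the following is only a minimal sketch of what such a paired-view embedding objective could look like, where all names (FirstThirdEmbedder, info_nce), dimensions, and the InfoNCE-style contrastive loss are assumptions, not the authors' published implementation.

```python
# Minimal sketch (assumptions throughout): two encoder heads that share
# structure but not weights ("semi-Siamese"), mapping synchronized
# first-person (ego) and third-person (exo) clip features into a joint
# embedding space, trained with an InfoNCE-style contrastive loss.
# Dimensions, loss, and temperature are illustrative placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FirstThirdEmbedder(nn.Module):
    def __init__(self, feat_dim=2048, embed_dim=256):
        super().__init__()
        # Separate heads, since first- and third-person feature
        # statistics differ even for the same activity.
        self.ego_head = nn.Sequential(nn.Linear(feat_dim, 512), nn.ReLU(),
                                      nn.Linear(512, embed_dim))
        self.exo_head = nn.Sequential(nn.Linear(feat_dim, 512), nn.ReLU(),
                                      nn.Linear(512, embed_dim))

    def forward(self, ego_feats, exo_feats):
        # L2-normalize so cosine similarity reduces to a dot product.
        z_ego = F.normalize(self.ego_head(ego_feats), dim=-1)
        z_exo = F.normalize(self.exo_head(exo_feats), dim=-1)
        return z_ego, z_exo

def info_nce(z_ego, z_exo, temperature=0.07):
    # Synchronized ego/exo clips form positive pairs; all other clips
    # in the batch serve as negatives (a standard contrastive setup,
    # assumed here rather than taken from the paper).
    logits = z_ego @ z_exo.t() / temperature
    targets = torch.arange(z_ego.size(0), device=z_ego.device)
    return F.cross_entropy(logits, targets)

# Usage sketch: ego/exo stand in for precomputed clip descriptors
# (e.g. spatial- and motion-domain features), 32 synchronized pairs.
model = FirstThirdEmbedder()
ego = torch.randn(32, 2048)
exo = torch.randn(32, 2048)
loss = info_nce(*model(ego, exo))
loss.backward()
```

Once trained on paired data, only the ego branch would be needed at test time, which is consistent with the abstract's claim that features can be extracted from arbitrary single-view egocentric videos without camera parameters or domain adaptation.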
Pages: 11