3D pose estimation;
Self-supervised learning;
Egocentric vision;
DOI:
10.1016/j.patcog.2023.109358
CLC number:
TP18 [Artificial intelligence theory];
Subject classification numbers:
081104;
0812;
0835;
1405;
Abstract:
We propose a novel approach to enhance the 3D body pose estimation of a person computed from videos captured from a single wearable camera. The main technical contribution consists of leveraging high-level features linking first- and third-views in a joint embedding space. To learn such an embedding space we introduce First2Third-Pose, a new paired synchronized dataset of nearly 2,000 videos depicting human activities captured from both first- and third-view perspectives. We explicitly consider spatial- and motion-domain features, combined using a semi-Siamese architecture trained in a self-supervised fashion. Experimental results demonstrate that the joint multi-view embedding space learned with our dataset is useful to extract discriminatory features from arbitrary single-view egocentric videos, with no need to perform any sort of domain adaptation or knowledge of camera parameters. An extensive evaluation demonstrates that we achieve significant improvement in egocentric 3D body pose estimation performance on two unconstrained datasets over three supervised state-of-the-art approaches. The collected dataset and pre-trained model are available for research purposes.
(c) 2023 The Author(s). Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license ( http://creativecommons.org/licenses/by-nc-nd/4.0/ )
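As a rough illustration of the paired-view embedding idea summarized above, the following PyTorch sketch combines a view-specific encoder for each view with a shared projection head and a contrastive pairing loss over synchronized clips. The layer sizes, the InfoNCE-style objective, and the choice of which weights are shared are illustrative assumptions, not the exact configuration used by First2Third-Pose.

# Hypothetical sketch of a semi-Siamese joint-embedding model for paired
# first-/third-person clips; dimensions and loss are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SemiSiameseEmbedding(nn.Module):
    def __init__(self, feat_dim=2048, embed_dim=256):
        super().__init__()
        # View-specific encoders (weights NOT shared): each maps pooled
        # spatial + motion features of one view to an intermediate representation.
        self.ego_encoder = nn.Sequential(nn.Linear(feat_dim, 512), nn.ReLU())
        self.third_encoder = nn.Sequential(nn.Linear(feat_dim, 512), nn.ReLU())
        # Shared projection head (weights shared across views): maps both views
        # into the joint embedding space -- the "semi-Siamese" part.
        self.shared_head = nn.Linear(512, embed_dim)

    def forward(self, ego_feats, third_feats):
        z_ego = F.normalize(self.shared_head(self.ego_encoder(ego_feats)), dim=-1)
        z_third = F.normalize(self.shared_head(self.third_encoder(third_feats)), dim=-1)
        return z_ego, z_third

def pairing_loss(z_ego, z_third, temperature=0.07):
    # Self-supervised contrastive objective: synchronized first/third clips
    # from the same moment are positives, all other pairs in the batch negatives.
    logits = z_ego @ z_third.t() / temperature
    targets = torch.arange(z_ego.size(0), device=z_ego.device)
    return F.cross_entropy(logits, targets)

# Usage with random stand-in features for a batch of 8 synchronized clip pairs.
model = SemiSiameseEmbedding()
ego = torch.randn(8, 2048)    # e.g. pooled spatial + motion features, ego view
third = torch.randn(8, 2048)  # matching third-person view features
loss = pairing_loss(*model(ego, third))
loss.backward()

In this reading, "semi-Siamese" means the two branches share only part of their weights (here, the projection head) while keeping view-specific encoders, since first- and third-person footage have very different appearance statistics.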