3D POSE ESTIMATION FROM MONOCULAR VIDEO WITH CAMERA-BONE ANGLE REGULARIZATION ON THE IMAGE FEATURE

Cited by: 0

Authors:

Ishii, Asuka [1]

Ikeda, Hiroo [1]

Affiliations:

[1] NEC Corp Ltd, Tokyo, Japan

Source:

2024 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, ICASSP 2024 | 2024

Keywords:

3D pose estimation; pose estimation; monocular; regularization;

DOI:

10.1109/ICASSP48485.2024.10446350

Chinese Library Classification:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

In this paper, we propose a monocular 3D pose estimation method that explicitly takes into account the angles between the camera optical axis and the bones (camera-bone angles), as well as temporal information. The proposed method combines a 2D-to-3D-based method, which predicts a 3D pose from a sequence of 2D poses, with a convolutional neural network (CNN), and includes a novel regularization loss that enables the CNN to extract camera-bone-angle information. The camera-bone-angle and temporal information reduce the ambiguity of 2D-to-3D-based methods, in which the same 2D pose can be mapped to multiple 3D poses. Experiments on the Human3.6M and MPI-INF-3DHP datasets showed that the proposed method improved the mean per joint position error (MPJPE) by 5.1 mm and 2.1 mm, respectively.
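The two quantities named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation or their regularization loss. It assumes joints are given as (x, y, z) tuples in camera coordinates and that the optical axis is the +z direction; the function names `mpjpe` and `camera_bone_angle` are hypothetical.

```python
import math


def mpjpe(pred, target):
    """Mean per joint position error: mean Euclidean distance over joints."""
    if len(pred) != len(target) or not pred:
        raise ValueError("pred and target must be non-empty and the same length")
    return sum(math.dist(p, t) for p, t in zip(pred, target)) / len(pred)


def camera_bone_angle(parent, child, optical_axis=(0.0, 0.0, 1.0)):
    """Angle (radians) between the bone vector parent->child and the optical axis.

    Assumption: joints are in camera coordinates and the optical axis is +z.
    """
    bone = [c - p for p, c in zip(parent, child)]
    dot = sum(b * a for b, a in zip(bone, optical_axis))
    norm = math.hypot(*bone) * math.hypot(*optical_axis)
    if norm == 0.0:
        raise ValueError("bone and optical axis must have non-zero length")
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, dot / norm)))


# Two bones that differ only in the sign of their depth component.
toward = camera_bone_angle((0, 0, 0), (1, 0, -1))
away = camera_bone_angle((0, 0, 0), (1, 0, 1))
```

Under an orthographic projection (a simplifying assumption made here), both bones in the usage lines project to the same 2D segment from (0, 0) to (1, 0), yet their camera-bone angles differ (3π/4 versus π/4). This is the kind of 2D-to-3D ambiguity that the abstract says camera-bone-angle information helps to suppress.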


Pages: 3740-3744

Page count: 5

