3D POSE ESTIMATION FROM MONOCULAR VIDEO WITH CAMERA-BONE ANGLE REGULARIZATION ON THE IMAGE FEATURE

Cited by: 0
Authors
Ishii, Asuka [1]
Ikeda, Hiroo [1]
Affiliation
[1] NEC Corp Ltd, Tokyo, Japan
Source
2024 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, ICASSP 2024 | 2024
Keywords
3D pose estimation; pose estimation; monocular; regularization;
DOI
10.1109/ICASSP48485.2024.10446350
Chinese Library Classification
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
In this paper, we propose a monocular 3D pose estimation method that explicitly accounts for the angles between the camera optical axis and the bones (camera-bone angles) as well as temporal information. The proposed method combines a 2D-to-3D-based method, which predicts a 3D pose from a sequence of 2D poses, with a convolutional neural network (CNN), and includes a novel regularization loss that enables the CNN to extract camera-bone-angle information. The camera-bone-angle and temporal information suppress the ambiguity of 2D-to-3D-based methods, in which the same 2D pose can be mapped to multiple 3D poses. Experiments on the Human3.6M and MPI-INF-3DHP datasets showed that the proposed method improved the mean per joint position error (MPJPE) by 5.1 mm and 2.1 mm, respectively.
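The two quantities named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not define the camera-bone angle in detail, so the sketch assumes camera coordinates with the optical axis along +z and takes a bone to be the vector from a parent joint to a child joint. The names `camera_bone_angles`, `mpjpe`, `OPTICAL_AXIS`, and the one-bone toy skeleton are hypothetical. The toy poses share the same x-y coordinates but differ in depth, which is the kind of ambiguity the abstract describes.

```python
import numpy as np

# Assumption: camera coordinates with the optical axis along +z.
OPTICAL_AXIS = np.array([0.0, 0.0, 1.0])


def camera_bone_angles(joints, bones):
    """Angle (radians) between the optical axis and each bone.

    joints: (J, 3) array of 3D joint positions in camera coordinates.
    bones:  list of (parent_index, child_index) pairs (hypothetical skeleton).
    """
    parents = np.array([p for p, _ in bones])
    children = np.array([c for _, c in bones])
    vecs = joints[children] - joints[parents]          # (B, 3) bone vectors
    norms = np.linalg.norm(vecs, axis=1)
    cos = (vecs @ OPTICAL_AXIS) / np.maximum(norms, 1e-12)
    return np.arccos(np.clip(cos, -1.0, 1.0))


def mpjpe(pred, gt):
    """Mean per joint position error: average Euclidean joint distance."""
    return float(np.mean(np.linalg.norm(pred - gt, axis=-1)))


# Toy example: one bone, two poses with identical x-y coordinates.
bones = [(0, 1)]
pose_a = np.array([[0.0, 0.0, 5.0], [1.0, 0.0, 6.0]])  # bone points away from camera
pose_b = np.array([[0.0, 0.0, 5.0], [1.0, 0.0, 4.0]])  # bone points toward camera

angle_a = camera_bone_angles(pose_a, bones)[0]
angle_b = camera_bone_angles(pose_b, bones)[0]
error_ab = mpjpe(pose_a, pose_b)
```

In this toy case the two poses cannot be told apart from their x-y coordinates alone, but their camera-bone angles differ (45 and 135 degrees), which is why angle information can help select among candidate 3D poses.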
Pages: 3740-3744
Page count: 5