3D Human Pose Estimation from Monocular Images with Deep Convolutional Neural Network

Cited by: 308
Authors
Li, Sijin [1]
Chan, Antoni B. [1]
Affiliations
[1] City Univ Hong Kong, Dept Comp Sci, Kowloon Tong, Hong Kong, Peoples R China
Source
COMPUTER VISION - ACCV 2014, PT II | 2015 / Vol. 9004
DOI
10.1007/978-3-319-16808-1_23
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we propose a deep convolutional neural network for 3D human pose estimation from monocular images. We train the network using two strategies: (1) a multi-task framework that jointly trains pose regression and body part detectors; (2) a pre-training strategy where the pose regressor is initialized using a network trained for body part detection. We evaluate our network on a large data set and achieve significant improvement over baseline methods. Human pose estimation is a structured prediction problem, i.e., the locations of the body parts are highly correlated. Although we do not add constraints about the correlations between body parts to the network, we empirically show that the network has disentangled the dependencies among different body parts, and learned their correlations.
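The multi-task strategy in the abstract can be illustrated with a minimal sketch of a combined training objective. This is not the authors' implementation: the function names, the choice of mean squared error for pose regression, binary cross-entropy for part detection, and the `det_weight` balancing factor are all assumptions made for illustration.

```python
import math


def pose_regression_loss(pred, target):
    """Mean squared error over flattened joint coordinates (assumed loss)."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def part_detection_loss(logits, labels):
    """Mean binary cross-entropy over per-part detection logits (assumed loss)."""
    total = 0.0
    for z, y in zip(logits, labels):
        # Numerically stable form of -[y*log(s) + (1-y)*log(1-s)], s = sigmoid(z).
        total += max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
    return total / len(labels)


def multi_task_loss(pose_pred, pose_true, det_logits, det_labels, det_weight=1.0):
    """Joint objective: pose regression plus weighted body part detection.

    det_weight is a hypothetical balancing factor, not a value from the paper.
    """
    return (pose_regression_loss(pose_pred, pose_true)
            + det_weight * part_detection_loss(det_logits, det_labels))
```

In a shared-feature network, both terms would be computed from the same convolutional features, so gradients from the detection term also shape the features used for regression. The pre-training strategy would instead train with the detection term alone first and then reuse those weights to initialize the pose regressor.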
Pages: 332-347
Page count: 16