Multi-source Deep Learning for Human Pose Estimation

Cited by: 150
Authors
Ouyang, Wanli [1 ]
Chu, Xiao [1 ]
Wang, Xiaogang [1 ]
Affiliations
[1] The Chinese University of Hong Kong, Department of Electronic Engineering, Hong Kong, China
Source
2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) | 2014
Keywords
DOI
10.1109/CVPR.2014.299
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Visual appearance score, appearance mixture type and deformation are three important information sources for human pose estimation. This paper proposes to build a multi-source deep model in order to extract non-linear representations from these different information sources. With the deep model, the global, high-order human body articulation patterns in these information sources are extracted for pose estimation. The task of estimating body-part locations and the task of human detection are jointly learned using a unified deep model. The proposed approach can be viewed as a post-processing step on pose estimation results and can be flexibly integrated with existing methods by taking their information sources as input. By extracting non-linear representations from multiple information sources, the deep model outperforms the state of the art by up to 8.6 percent on three public benchmark datasets.
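As a rough illustration of the multi-source idea (a minimal sketch assuming a PyTorch implementation; the layer sizes, input encodings, and head names below are hypothetical and not taken from the paper), the appearance scores, mixture types, and deformations produced by an existing part-based estimator can be concatenated and fed to a small fully connected network whose shared non-linear representation feeds two heads, one refining body-part locations and one scoring human detection:

# Illustrative sketch only; not the authors' implementation.
import torch
import torch.nn as nn

class MultiSourcePoseModel(nn.Module):
    """Fuses three information sources (appearance score, mixture type,
    deformation) into a shared non-linear representation, then jointly
    predicts part locations and a human-detection score."""
    def __init__(self, n_parts=14, n_types=6):
        super().__init__()
        # Hypothetical input encoding: one appearance score per part,
        # a one-hot mixture type per part, and (dx, dy) deformation per part.
        in_dim = n_parts * (1 + n_types + 2)
        self.shared = nn.Sequential(
            nn.Linear(in_dim, 256), nn.ReLU(),
            nn.Linear(256, 128), nn.ReLU(),
        )
        self.part_head = nn.Linear(128, n_parts * 2)  # refined (x, y) per part
        self.detect_head = nn.Linear(128, 1)          # human / non-human score

    def forward(self, appearance, mixture, deformation):
        # Each input: (batch, n_parts, feature_dim), taken from the output
        # of an existing pose estimator (post-processing view of the paper).
        x = torch.cat([appearance, mixture, deformation], dim=-1).flatten(1)
        h = self.shared(x)
        return self.part_head(h), self.detect_head(h)

# Usage with random stand-in features for a batch of 4 pose candidates.
model = MultiSourcePoseModel()
app = torch.rand(4, 14, 1)
mix = torch.rand(4, 14, 6)
defo = torch.rand(4, 14, 2)
locations, detection = model(app, mix, defo)

The joint detection head reflects the paper's stated design of learning pose estimation and human detection together; everything else (feature dimensions, two-layer shared trunk) is an assumption made only to keep the sketch self-contained.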
Pages: CP32
Page count: 1