Understanding holistic human pose using class-specific convolutional neural network

Cited by: 0

Authors:

Faranak Shamsafar

Hossein Ebrahimnezhad

Affiliation:

[1] Sahand University of Technology, Computer Vision Research Lab, Electrical Engineering Faculty

Source:

Multimedia Tools and Applications | 2018 / Vol. 77

Keywords:

Human pose estimation; Holistic pose; RGB images; Unconstrained conditions; Deep learning; Convolutional neural network;

DOI:

Not available

Abstract:

This paper presents a method for capturing human pose from individual real-world RGB images using deep learning. Existing deep learning approaches to human pose estimation are formulated as detection or regression problems and operate in a part-based manner. As a new perspective, we introduce a classification scheme for this problem that reasons about the pose holistically. To the best of our knowledge, this is the first work on holistic human pose classification, a task made feasible by the power of convolutional neural networks in feature learning. After a convolutional neural network is trained to classify the input image into one of the KeyPoses, the final pose is computed as a linear combination of several KeyPoses. In this holistic classification approach, the vast, high-degree-of-freedom human pose space is divided into a finite number of subspaces, and the convolutional neural network shows promising results in learning the features of each subspace. Empirical results (PCP and PCK rates) demonstrate that the proposed scheme can understand human pose (i.e., predict a valid, true and coarse pose) in real-world unconstrained images with challenges such as severe occlusion, high articulation, low quality and cluttered background. Furthermore, the proposed method removes the need to define a complex model (such as an appearance model or pairwise joint relations). We have also verified a potential application of the proposed method in semantic image retrieval based on human pose.
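
The abstract states that the final pose is a linear combination of several KeyPoses but does not say how the weights are chosen. The following is a minimal sketch of one plausible reading, assuming the weights are the renormalized class probabilities of the `top_k` most likely KeyPoses. The function name `combine_keyposes`, the `top_k` parameter, and the `(num_classes, num_joints, 2)` array layout are hypothetical and are not taken from the paper.

```python
import numpy as np

def combine_keyposes(class_probs, keyposes, top_k=3):
    """Blend the top_k most probable KeyPoses into a single pose estimate.

    Assumption (not from the paper): the blending weights are the
    classifier probabilities of the selected KeyPoses, renormalized
    so that they sum to 1.
    """
    class_probs = np.asarray(class_probs, dtype=float)  # shape (num_classes,)
    keyposes = np.asarray(keyposes, dtype=float)        # shape (num_classes, num_joints, 2)

    # Indices of the top_k classes, most probable first.
    top = np.argsort(class_probs)[::-1][:top_k]

    # Renormalize the selected probabilities into blending weights.
    weights = class_probs[top] / class_probs[top].sum()

    # Weighted sum over the selected KeyPoses -> shape (num_joints, 2).
    return np.tensordot(weights, keyposes[top], axes=1)
```

With `top_k=1` this reduces to returning the single predicted KeyPose; larger values interpolate between neighbouring pose subspaces.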

Pages: 23193-23225

Page count: 32
