Deep representation learning for human motion prediction and classification

Cited by: 279

Authors:

Butepage, Judith [1]

Black, Michael J. [2]

Kragic, Danica [1]

Kjellstrom, Hedvig [1]

Affiliations:

[1] KTH, CSC, Dept Robot Percept & Learning, Stockholm, Sweden

[2] Max Planck Inst Intelligent Syst, Perceiving Syst Dept, Tubingen, Germany

Source:

30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017) | 2017

Funding:

European Union Horizon 2020

Keywords:

DOI:

10.1109/CVPR.2017.173

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Generative models of 3D human motion are often restricted to a small number of activities and therefore cannot generalize well to novel movements or applications. In this work we propose a deep learning framework for human motion capture data that learns a generic representation from a large corpus of motion capture data and generalizes well to new, unseen motions. Using an encoding-decoding network that learns to predict future 3D poses from the most recent past, we extract a feature representation of human motion. Most work on deep learning for sequence prediction focuses on video and speech. Since skeletal data has a different structure, we present and evaluate different network architectures that make different assumptions about time dependencies and limb correlations. To quantify the learned features, we use the output of different layers for action classification and visualize the receptive fields of the network units. Our method outperforms the recent state of the art in skeletal motion prediction, even though those methods use action-specific training data. Our results show that deep feedforward networks, trained from a generic mocap database, can successfully be used for feature extraction from human motion data and that this representation can be used as a foundation for classification and prediction.
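The encoding-decoding idea in the abstract can be illustrated with a minimal, untrained sketch: a feedforward network takes a window of recent poses, compresses it to a bottleneck code, and decodes that code into a window of future poses. This is not the authors' architecture. The joint count, window lengths, layer widths, and the names `encode` and `predict` are all assumptions chosen for illustration, and training (for example, minimizing squared error between predicted and true future poses) is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

# All sizes below are illustrative assumptions, not values from the paper.
N_JOINTS = 20               # hypothetical skeleton size
POSE_DIM = N_JOINTS * 3     # 3D coordinates per joint
PAST, FUTURE = 10, 10       # frames in the input and output windows
HIDDEN, CODE = 128, 32      # hidden and bottleneck widths


def init_layer(n_in, n_out):
    """Small random weights and zero biases for one dense layer."""
    return rng.normal(0.0, 0.01, size=(n_in, n_out)), np.zeros(n_out)


params = {
    "enc1": init_layer(PAST * POSE_DIM, HIDDEN),
    "enc2": init_layer(HIDDEN, CODE),
    "dec1": init_layer(CODE, HIDDEN),
    "dec2": init_layer(HIDDEN, FUTURE * POSE_DIM),
}


def dense(x, layer, activate=True):
    """Affine map followed by an optional ReLU."""
    w, b = layer
    y = x @ w + b
    return np.maximum(y, 0.0) if activate else y


def encode(past):
    """Map past windows (B, PAST, POSE_DIM) to bottleneck codes (B, CODE)."""
    x = past.reshape(past.shape[0], -1)
    return dense(dense(x, params["enc1"]), params["enc2"])


def predict(past):
    """Decode the bottleneck code into future poses (B, FUTURE, POSE_DIM)."""
    h = dense(encode(past), params["dec1"])
    out = dense(h, params["dec2"], activate=False)
    return out.reshape(-1, FUTURE, POSE_DIM)


batch = rng.normal(size=(4, PAST, POSE_DIM))  # random stand-in for mocap windows
features = encode(batch)   # representation that a separate classifier could use
future = predict(batch)    # predicted future pose window
print(features.shape, future.shape)  # (4, 32) (4, 10, 60)
```

In this sketch the bottleneck output plays the role of the learned feature representation: once such a network is trained for prediction, the output of `encode` (or of any intermediate layer) could be passed to a separate action classifier, which is the use of layer outputs that the abstract describes.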


Pages: 1591-1599

Page count: 9

