GAN-Poser: an improvised bidirectional GAN model for human motion prediction

Cited by: 23

Authors:

Jain, Deepak Kumar [1]

Zareapoor, Masoumeh [2]

Jain, Rachna [3]

Kathuria, Abhishek [3]

Bachhety, Shivam [3]

Affiliations:

[1] Chongqing Univ Posts & Telecommun, Key Lab Intelligent Air Ground Cooperat Control U, Coll Automat, Chongqing, Peoples R China

[2] Shanghai Jiao Tong Univ, Sch Elect Informat & Elect Engn, Shanghai, Peoples R China

[3] Bharati Vidyapeeths Coll Engn, Dept Comp Sci & Engn, New Delhi, India

Source:

NEURAL COMPUTING & APPLICATIONS | 2020 / Vol. 32 / Issue 18

Keywords:

Human motion; GAN; Probability theory; Pose estimation; Sequence model; 3D model; NEURAL-NETWORKS; 3D;

DOI:

10.1007/s00521-020-04941-4

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

A novel method called GAN-Poser is explored for predicting human motion in less time, given an input 3D human skeleton sequence, based on a generator-discriminator framework. Specifically, rather than the conventional Euclidean loss, a frame-wise geodesic loss is used for geometrically meaningful and more precise distance measurement. In this paper, we use a bidirectional GAN framework along with a recursive prediction strategy to avoid mode collapse and to further regularize the training. To generate multiple probable human-pose sequences conditioned on a given starting sequence, a random extrinsic factor Theta is also introduced. The discriminator is trained to regress the extrinsic factor Theta, which is used alongside the intrinsic factor (the encoded starting pose sequence) to generate a particular pose sequence. Despite the probabilistic framework, the modified discriminator architecture allows predictions of an intermediate part of the pose sequence to be used as conditioning for prediction of the latter part of the sequence. This adversarial learning-based model takes stochasticity into consideration, and the bidirectional setup provides a new direction for evaluating prediction quality against a given test sequence. Our resulting method, GAN-Poser, achieves superior performance over state-of-the-art deep learning approaches when evaluated on the standard NTU-RGB-D and Human3.6M datasets.
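
As an illustration of the contrast the abstract draws between a Euclidean loss and a frame-wise geodesic loss, the following is a minimal sketch, not the authors' implementation. It assumes each pose is stored as per-joint 3x3 rotation matrices and uses the standard geodesic distance on SO(3), the angle of the relative rotation. The abstract does not specify the pose representation or the exact loss, so the function names (`geodesic_distance`, `framewise_geodesic_loss`, `rot_z`) and array shapes are hypothetical.

```python
import numpy as np


def geodesic_distance(r_a, r_b):
    """Angle in radians of the relative rotation between arrays of shape (..., 3, 3)."""
    # Relative rotation R_a^T R_b; its trace equals 1 + 2*cos(angle).
    rel = np.swapaxes(r_a, -1, -2) @ r_b
    cos_angle = (np.trace(rel, axis1=-2, axis2=-1) - 1.0) / 2.0
    # Clip to guard against round-off pushing the value outside [-1, 1].
    return np.arccos(np.clip(cos_angle, -1.0, 1.0))


def framewise_geodesic_loss(pred, target):
    """Mean joint angle error per frame, averaged over frames.

    Assumed shapes (hypothetical): pred and target are (frames, joints, 3, 3).
    """
    per_frame = geodesic_distance(pred, target).mean(axis=-1)
    return per_frame.mean()


def rot_z(theta):
    """Rotation by theta radians about the z axis (helper for the example)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


if __name__ == "__main__":
    # Two frames, three joints: target is the identity pose,
    # prediction is off by 0.1 rad about z at every joint.
    target = np.tile(np.eye(3), (2, 3, 1, 1))
    pred = np.tile(rot_z(0.1), (2, 3, 1, 1))
    print(framewise_geodesic_loss(pred, target))
```

Unlike an element-wise Euclidean difference between rotation matrices, this quantity depends only on the angle separating the two rotations, which is the sense in which a geodesic measure can be called geometrically meaningful.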

Pages: 14579-14591

Page count: 13
