Unsupervised 3D Human Pose Representation with Viewpoint and Pose Disentanglement

Cited by: 54

Authors:

Nie, Qiang [1,2]

Liu, Ziwei [1]

Liu, Yunhui [1,2]

Affiliations:

[1] Chinese Univ Hong Kong, Shatin, Hong Kong, Peoples R China

[2] CUHK, Stone Robot Inst, Shatin, Hong Kong, Peoples R China

Source:

COMPUTER VISION - ECCV 2020, PT XIX | 2020 / Vol. 12364

Keywords:

Representation learning; 3D human pose; Pose denoising; Unsupervised action recognition; ACTION RECOGNITION;

DOI:

10.1007/978-3-030-58529-7_7

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Learning a good 3D human pose representation is important for human pose related tasks, e.g. human 3D pose estimation and action recognition. Within all these problems, preserving the intrinsic pose information and adapting to view variations are two critical issues. In this work, we propose a novel Siamese denoising autoencoder to learn a 3D pose representation by disentangling the pose-dependent and view-dependent features from the human skeleton data, in a fully unsupervised manner. These two disentangled features are utilized together as the representation of the 3D pose. To consider both the kinematic and geometric dependencies, a sequential bidirectional recursive network (SeBiReNet) is further proposed to model the human skeleton data. Extensive experiments demonstrate that the learned representation 1) preserves the intrinsic information of human pose, and 2) shows good transferability across datasets and tasks. Notably, our approach achieves state-of-the-art performance on two inherently different tasks: pose denoising and unsupervised action recognition. Code and models are available at: https://github.com/NIEQiang001/unsupervised-human-pose.git.


Pages: 102-118

Page count: 17
