View Transfer on Human Skeleton Pose: Automatically Disentangle the View-Variant and View-Invariant Information for Pose Representation Learning

Cited by: 10
Authors
Nie, Qiang [1]
Liu, Yunhui [1]
Affiliations
[1] Chinese Univ Hong Kong, Hong Kong, Peoples R China
Keywords
Representation learning; Human skeleton pose; View transfer; Unsupervised action recognition; Action recognition
DOI
10.1007/s11263-020-01354-7
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Learning a good pose representation is important for many applications, such as human pose estimation and action recognition. However, the representations learned by most approaches are not intrinsic, and their transferability across datasets and tasks is limited. In this paper, we introduce a method to learn a versatile representation that can recover unseen corrupted skeletons, be applied to human action recognition, and transfer a pose from one view to another without knowing the relationships between cameras. To this end, a sequential bidirectional recursive network (SeBiReNet) is proposed for modeling the kinematic dependency between skeleton joints. Using SeBiReNet as the core module, a denoising autoencoder is designed to learn intrinsic pose features through the task of recovering corrupted skeletons. Instead of extracting only the view-invariant feature, as many other methods do, we disentangle the view-invariant feature from the view-variant feature in the latent space and use them together as a representation of the human pose. For better feature disentanglement, an adversarial augmentation strategy is proposed and applied to the denoising autoencoder. Disentangling the view-variant and view-invariant features enables view transfer on 3D poses. Extensive experiments on different datasets and tasks verify the effectiveness and versatility of the learned representation.
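The view-transfer idea in the abstract can be illustrated with a toy sketch. This is not the authors' SeBiReNet or their learned autoencoder: the disentanglement here is computed analytically, under the assumption that the view is a pure rotation about the vertical axis and that joints 0 and 1 define the body's facing direction. The names `rotate_y`, `encode`, `decode`, and `view_transfer` are hypothetical and chosen only for illustration.

```python
import math


def rotate_y(pose, theta):
    """Rotate every joint (x, y, z) about the vertical y-axis by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x + s * z, y, -s * x + c * z) for x, y, z in pose]


def encode(pose):
    """Split a pose into (view_code, invariant_code).

    Assumption (toy stand-in for a learned encoder): the view is a pure yaw,
    read off the horizontal direction of the segment from joint 0 to joint 1.
    """
    dx = pose[1][0] - pose[0][0]
    dz = pose[1][2] - pose[0][2]
    yaw = math.atan2(-dz, dx)          # view-variant part
    canonical = rotate_y(pose, -yaw)   # view-invariant part
    return yaw, canonical


def decode(yaw, canonical):
    """Recombine a view code and an invariant code into a 3D pose."""
    return rotate_y(canonical, yaw)


def view_transfer(source_pose, target_view_pose):
    """Render the source pose as seen from the target pose's viewpoint."""
    target_yaw, _ = encode(target_view_pose)
    _, source_invariant = encode(source_pose)
    return decode(target_yaw, source_invariant)
```

Swapping only the view code while keeping the invariant code is what makes the transfer possible without any knowledge of the camera relationship. In the paper both codes live in a learned latent space rather than being a closed-form angle and a rotated skeleton.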
Pages: 1-22
Number of pages: 22
Related papers
50 in total
  • [1] View Transfer on Human Skeleton Pose: Automatically Disentangle the View-Variant and View-Invariant Information for Pose Representation Learning
    Qiang Nie
    Yunhui Liu
    International Journal of Computer Vision, 2021, 129 : 1 - 22
  • [2] View-invariant representation and learning of human action
    Rao, C
    Shah, M
    IEEE WORKSHOP ON DETECTION AND RECOGNITION OF EVENTS IN VIDEO, PROCEEDINGS, 2001, : 55 - 63
  • [3] View-Invariant, Occlusion-Robust Probabilistic Embedding for Human Pose
    Liu, Ting
    Sun, Jennifer J.
    Zhao, Long
    Zhao, Jiaping
    Yuan, Liangzhe
    Wang, Yuxiao
    Chen, Liang-Chieh
    Schroff, Florian
    Adam, Hartwig
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2022, 130 (01) : 111 - 135
  • [4] View-Invariant, Occlusion-Robust Probabilistic Embedding for Human Pose
    Ting Liu
    Jennifer J. Sun
    Long Zhao
    Jiaping Zhao
    Liangzhe Yuan
    Yuxiao Wang
    Liang-Chieh Chen
    Florian Schroff
    Hartwig Adam
    International Journal of Computer Vision, 2022, 130 : 111 - 135
  • [5] View-Invariant Skeleton Action Representation Learning via Motion Retargeting
    Yang, Di
    Wang, Yaohui
    Dantcheva, Antitza
    Garattoni, Lorenzo
    Francesca, Gianpiero
    Bremond, Francois
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2024, 132 (07) : 2351 - 2366
  • [6] View-Invariant Pose Analysis for Human Movement Assessment from RGB Data
    Sardari, Faegheh
    Paiement, Adeline
    Mirmehdi, Majid
    IMAGE ANALYSIS AND PROCESSING - ICIAP 2019, PT II, 2019, 11752 : 237 - 248
  • [7] View-Invariant Pose Recognition Using Multilinear Analysis and the Universum
    Peng, Bo
    Qian, Gang
    Ma, Yunqian
    ADVANCES IN VISUAL COMPUTING, PT II, PROCEEDINGS, 2008, 5359 : 581 - +
  • [8] Learning Human Identity Using View-Invariant Multi-view Movement Representation
    Iosifidis, Alexandros
    Tefas, Anastasios
    Nikolaidis, Nikolaos
    Pitas, Ioannis
    BIOMETRICS AND ID MANAGEMENT, 2011, 6583 : 217 - 226
  • [9] Development of a view-invariant representation of the human head
    Gliga, Teodora
    Dehaene-Lambertz, Ghislaine
    COGNITION, 2007, 102 (02) : 261 - 288
  • [10] View-invariant Feature using Pose Information and Flexible Matching Algorithm for Action Retrieval
    Yoshida, Noboru
    Liu, Jianquan
    2021 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2021, : 1556 - 1562