View-Invariant, Occlusion-Robust Probabilistic Embedding for Human Pose

Authors
Ting Liu
Jennifer J. Sun
Long Zhao
Jiaping Zhao
Liangzhe Yuan
Yuxiao Wang
Liang-Chieh Chen
Florian Schroff
Hartwig Adam
Affiliations
[1] Google Research
[2] California Institute of Technology
[3] Rutgers University
Keywords
Human pose embedding; Probabilistic embedding; View-invariant pose retrieval; Action retrieval; Occlusion robustness
DOI
Not available
Abstract
Recognition of human poses and actions is crucial for autonomous systems to interact smoothly with people. However, cameras generally capture human poses in 2D as images and videos, whose appearance can vary significantly across viewpoints, making recognition challenging. To address this, we explore recognizing similarity in 3D human body poses from 2D information, which has not been well studied in existing work. We propose an approach to learning a compact view-invariant embedding space from 2D body joint keypoints, without explicitly predicting 3D poses. Input ambiguities of 2D poses from projection and occlusion are difficult to represent through a deterministic mapping, so we adopt a probabilistic formulation for our embedding space. Experimental results show that our embedding model achieves higher accuracy than 3D pose estimation models when retrieving similar poses across different camera views. We also show that by training a simple temporal embedding model, we achieve superior performance on pose sequence retrieval and substantially reduce the embedding dimension relative to stacking frame-based embeddings, enabling efficient large-scale retrieval. Furthermore, to enable our embeddings to work with partially visible input, we investigate different keypoint occlusion augmentation strategies during training. We demonstrate that these occlusion augmentations significantly improve retrieval performance on partial 2D input poses. Results on action recognition and video alignment demonstrate that using our embeddings without any additional training achieves competitive performance relative to other models specifically trained for each task.
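The abstract mentions keypoint occlusion augmentation during training but does not say which strategies were studied. As a purely illustrative sketch, one simple possibility is to drop each 2D keypoint independently with a fixed probability and pass a visibility mask alongside the pose. The function name `drop_keypoints`, the `drop_prob` parameter, and the choice to zero out occluded joints are all assumptions made here, not details taken from the paper.

```python
import random


def drop_keypoints(keypoints, drop_prob=0.2, rng=None):
    """Randomly mark 2D keypoints as occluded (hypothetical helper).

    keypoints: list of (x, y) tuples, one per body joint.
    Returns (masked_keypoints, visibility_mask). Occluded joints are
    replaced by (0.0, 0.0) and flagged with 0 in the mask.
    """
    if rng is None:
        rng = random.Random()
    masked, mask = [], []
    for x, y in keypoints:
        # Each joint is dropped independently with probability drop_prob.
        visible = rng.random() >= drop_prob
        masked.append((x, y) if visible else (0.0, 0.0))
        mask.append(1 if visible else 0)
    return masked, mask


# Example: augment a toy three-joint pose with a seeded generator.
pose = [(0.1, 0.2), (0.3, 0.4), (0.5, 0.6)]
augmented, visibility = drop_keypoints(pose, drop_prob=0.5, rng=random.Random(0))
```

Passing the mask together with the masked coordinates lets a downstream model distinguish an occluded joint from one that genuinely lies at the origin; whether the authors encode occlusion this way is not stated in the abstract.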
Pages: 111-135
Page count: 24