View-Invariant, Occlusion-Robust Probabilistic Embedding for Human Pose

Authors
Ting Liu
Jennifer J. Sun
Long Zhao
Jiaping Zhao
Liangzhe Yuan
Yuxiao Wang
Liang-Chieh Chen
Florian Schroff
Hartwig Adam
Affiliations
[1] Google Research
[2] California Institute of Technology
[3] Rutgers University
Keywords
Human pose embedding; Probabilistic embedding; View-invariant pose retrieval; Action retrieval; Occlusion robustness
DOI
Not available
Abstract
Recognition of human poses and actions is crucial for autonomous systems to interact smoothly with people. However, cameras generally capture human poses in 2D as images and videos, whose appearance can vary significantly across viewpoints, making recognition challenging. To address this, we explore recognizing similarity in 3D human body poses from 2D information, which has not been well studied in existing work. We propose an approach to learning a compact view-invariant embedding space from 2D body joint keypoints, without explicitly predicting 3D poses. Because the input ambiguities of 2D poses arising from projection and occlusion are difficult to represent through a deterministic mapping, we adopt a probabilistic formulation for our embedding space. Experimental results show that our embedding model achieves higher accuracy than 3D pose estimation models when retrieving similar poses across different camera views. We also show that by training a simple temporal embedding model, we achieve superior performance on pose sequence retrieval and substantially reduce the embedding dimension compared with stacking frame-based embeddings, enabling efficient large-scale retrieval. To enable our embeddings to work with partially visible input, we further investigate different keypoint occlusion augmentation strategies during training, and demonstrate that these augmentations significantly improve retrieval performance on partial 2D input poses. Results on action recognition and video alignment demonstrate that using our embeddings without any additional training achieves competitive performance relative to other models specifically trained for each task.
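The abstract mentions keypoint occlusion augmentation during training but does not specify the strategies. As a rough illustration only, the sketch below shows one simple possibility: independently hiding each 2D joint with a fixed probability and returning a visibility mask alongside the masked pose. The function name `random_keypoint_dropout`, the 13-joint layout, the drop probability, and the zero-fill convention are all assumptions made for this example, not details taken from the paper.

```python
import numpy as np


def random_keypoint_dropout(keypoints, drop_prob=0.2, rng=None):
    """Hypothetical augmentation: hide each joint independently with probability drop_prob.

    keypoints: array of shape (num_joints, 2) holding 2D joint coordinates.
    Returns the masked keypoints and a per-joint visibility mask (1.0 = visible).
    """
    rng = np.random.default_rng() if rng is None else rng
    keypoints = np.asarray(keypoints, dtype=float)
    # mask[j] is 1.0 if joint j stays visible, 0.0 if it is marked occluded.
    mask = (rng.random(keypoints.shape[0]) >= drop_prob).astype(float)
    # Zero out hidden joints so a model cannot read their coordinates.
    masked = keypoints * mask[:, None]
    return masked, mask


# Made-up pose with 13 joints, used only to exercise the function.
pose = np.arange(26, dtype=float).reshape(13, 2)
masked_pose, visibility = random_keypoint_dropout(
    pose, drop_prob=0.3, rng=np.random.default_rng(0)
)
```

Passing the visibility mask to the model together with the masked coordinates is one common way to let a network distinguish a truly occluded joint from a joint located at the origin; whether the paper does this is not stated in the abstract.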
Pages: 111-135
Page count: 24