View-Invariant, Occlusion-Robust Probabilistic Embedding for Human Pose

Cited by: 0
Authors
Ting Liu
Jennifer J. Sun
Long Zhao
Jiaping Zhao
Liangzhe Yuan
Yuxiao Wang
Liang-Chieh Chen
Florian Schroff
Hartwig Adam
Affiliations
[1] Google Research
[2] California Institute of Technology
[3] Rutgers University
Source
Keywords
Human pose embedding; Probabilistic embedding; View-invariant pose retrieval; Action retrieval; Occlusion robustness
DOI
Not available
Chinese Library Classification number
Subject classification number
Abstract
Recognition of human poses and actions is crucial for autonomous systems to interact smoothly with people. However, cameras generally capture human poses in 2D as images and videos, whose appearance can vary significantly across viewpoints, making recognition tasks challenging. To address this, we explore recognizing similarity in 3D human body poses from 2D information, which has not been well studied in existing work. We propose an approach to learning a compact view-invariant embedding space from 2D body joint keypoints, without explicitly predicting 3D poses. Input ambiguities of 2D poses from projection and occlusion are difficult to represent through a deterministic mapping, so we adopt a probabilistic formulation for our embedding space. Experimental results show that our embedding model achieves higher accuracy than 3D pose estimation models when retrieving similar poses across different camera views. We also show that by training a simple temporal embedding model, we achieve superior performance on pose sequence retrieval and, compared with stacking frame-based embeddings, largely reduce the embedding dimension for efficient large-scale retrieval. To enable our embeddings to work with partially visible input, we further investigate different keypoint occlusion augmentation strategies during training. We demonstrate that these occlusion augmentations significantly improve retrieval performance on partial 2D input poses. Results on action recognition and video alignment demonstrate that using our embeddings without any additional training achieves competitive performance relative to other models specifically trained for each task.
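To make the idea of keypoint occlusion augmentation concrete, the following is a minimal sketch of one possible strategy: hiding each 2D keypoint independently at random and returning a visibility mask alongside the masked pose. The abstract does not specify which strategies the authors use, so the function name `mask_keypoints`, the `drop_prob` parameter, and the choice to zero out hidden keypoints are all illustrative assumptions rather than the paper's method.

```python
import random
from typing import List, Optional, Tuple

Keypoint = Tuple[float, float]


def mask_keypoints(
    pose: List[Keypoint],
    drop_prob: float,
    rng: Optional[random.Random] = None,
) -> Tuple[List[Keypoint], List[int]]:
    """Independently hide each 2D keypoint with probability drop_prob.

    Hypothetical helper (not from the paper). Returns the masked pose,
    with hidden keypoints set to (0.0, 0.0), and a 0/1 visibility mask
    of the same length as the input pose.
    """
    if not 0.0 <= drop_prob <= 1.0:
        raise ValueError("drop_prob must be in [0, 1]")
    if rng is None:
        rng = random.Random()

    masked: List[Keypoint] = []
    visible: List[int] = []
    for x, y in pose:
        # rng.random() is in [0, 1), so drop_prob=0 keeps every keypoint
        # and drop_prob=1 hides every keypoint.
        keep = rng.random() >= drop_prob
        masked.append((x, y) if keep else (0.0, 0.0))
        visible.append(1 if keep else 0)
    return masked, visible
```

In a training loop, such a function could be applied to each input pose before it is passed to the embedding model, with the visibility mask supplied as an extra input so the model can tell a hidden keypoint apart from one that really lies at the origin. Whether the mask is fed to the model is likewise an assumption of this sketch.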
Pages: 111-135
Page count: 24
Related papers
50 in total
  • [21] View-Invariant 3D Human Body Pose Reconstruction using a Monocular Video Camera
    Ke, Shian-Ru
    Hwang, Jenq-Neng
    Lan, Kung-Ming
    Wang, Shen-Zheng
    2011 FIFTH ACM/IEEE INTERNATIONAL CONFERENCE ON DISTRIBUTED SMART CAMERAS (ICDSC), 2011
  • [22] View-invariant recognition of body pose from space-time templates
    Shen, Yuping
    Foroosh, Hassan
    2008 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, VOLS 1-12, 2008, : 3522 - 3527
  • [23] View-Invariant Robot Adaptation to Human Action Timing
    Noceti, Nicoletta
    Odone, Francesca
    Rea, Francesco
    Sciutti, Alessandra
    Sandini, Giulio
    INTELLIGENT SYSTEMS AND APPLICATIONS, VOL 1, 2019, 868 : 804 - 821
  • [24] A New Method of View-Invariant Human Activity Recognition
    Su, Han
    Wang, Wenjie
    PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON LOGISTICS, ENGINEERING, MANAGEMENT AND COMPUTER SCIENCE (LEMCS 2015), 2015, 117 : 1648 - 1652
  • [25] Towards Fast, View-Invariant Human Action Recognition
    Cherla, Srikanth
    Kulkarni, Kaustubh
    Kale, Amit
    Ramasubramanian, V.
    2008 IEEE COMPUTER SOCIETY CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, VOLS 1-3, 2008, : 1650 - 1657
  • [26] Advances in View-Invariant Human Motion Analysis: A Review
    Ji, Xiaofei
    Liu, Honghai
    IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART C-APPLICATIONS AND REVIEWS, 2010, 40 (01): 13 - 24
  • [27] Dense and Occlusion-Robust Multi-View Stereo for Unstructured Videos
    Wei, Jian
    Resch, Benjamin
    Lensch, Hendrik P. A.
    2016 13TH CONFERENCE ON COMPUTER AND ROBOT VISION (CRV), 2016, : 69 - 76
  • [28] Highly Robust Action Retrieval using View-invariant Pose Feature and Simple yet Effective Query Expansion Method
    Yoshida, Noboru
    Liu, Jianquan
    PROCEEDINGS OF 2022 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2022, : 1269 - 1277
  • [29] A survey about view-invariant human action recognition
    Nghia Pham Trong
    Anh Truong Minh
    Nguyen, Hung
    Kazunori, Kotani
    Bac Le Hoai
    2017 56TH ANNUAL CONFERENCE OF THE SOCIETY OF INSTRUMENT AND CONTROL ENGINEERS OF JAPAN (SICE), 2017, : 699 - 704
  • [30] Self-Supervised Video Pose Representation Learning for Occlusion-Robust Action Recognition
    Yang, Di
    Wang, Yaohui
    Dantcheva, Antitza
    Garattoni, Lorenzo
    Francesca, Gianpiero
    Bremond, Francois
    2021 16TH IEEE INTERNATIONAL CONFERENCE ON AUTOMATIC FACE AND GESTURE RECOGNITION (FG 2021), 2021