View Enhanced Jigsaw Puzzle for Self-Supervised Feature Learning in 3D Human Action Recognition

Cited by: 4
Authors
You, Wei [1]
Wang, Xue [1]
Affiliations
[1] Tsinghua Univ, Dept Precis Instrument, Beijing 100084, Peoples R China
Source
IEEE ACCESS | 2022 / Vol. 10
Keywords
Task analysis; Skeleton; Feature extraction; Training; Representation learning; Three-dimensional displays; Image recognition; Action recognition; self-supervised learning; multi-view; pretext task; human skeleton; gate recurrent unit
DOI
10.1109/ACCESS.2022.3165040
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Self-supervised learning methods have received much attention in skeleton-based human action recognition. These methods rely on pretext tasks to exploit unlabeled data and learn an effective feature encoder. In this paper, a novel self-supervised learning method is proposed. First, we design a new pretext task, the view enhanced jigsaw puzzle (VEJP), to increase the learning difficulty for the encoder. VEJP introduces multi-view information into the jigsaw puzzle, forcing the encoder to learn view-independent, high-level features of human skeletons. Building on the encoder trained with VEJP, we propose the view pooling encoder (VPE), which integrates information from multiple views through a pooling mechanism; the features extracted by VPE are more robust and distinguishable. In addition, by adjusting the difficulty of VEJP, we study the influence of pretext task difficulty on downstream task performance; the experimental results show that pretext tasks should be moderately difficult to achieve effective feature learning. Our method achieves competitive results on representative benchmark datasets. It provides a strong baseline for the jigsaw puzzle task and shows advantages when very little labeled data is available.
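As a rough illustration of the two ideas named in the abstract, the sketch below builds one jigsaw-style training sample from a skeleton sequence and pools per-view features. It is not the authors' implementation: the abstract does not specify how views are generated or which pooling is used, so rotating each shuffled segment about the vertical axis and taking an element-wise maximum over views are assumptions made here, and the names `rotation_y`, `make_view_jigsaw_sample`, and `view_pool` are hypothetical.

```python
import itertools
import numpy as np


def rotation_y(angle_rad):
    """Rotation matrix about the vertical (y) axis, used to simulate a view change."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])


def make_view_jigsaw_sample(seq, n_segments=3, max_angle=np.pi / 2, rng=None):
    """Build one pretext-task sample from a skeleton sequence.

    seq: array of shape (T, J, 3) holding T frames of J joints in 3D.
    Returns the shuffled sequence and the index of the permutation used,
    which serves as the classification label for the pretext task.
    """
    rng = np.random.default_rng() if rng is None else rng
    perms = list(itertools.permutations(range(n_segments)))
    label = int(rng.integers(len(perms)))
    segments = np.array_split(seq, n_segments, axis=0)
    pieces = []
    for idx in perms[label]:
        # Assumption: each temporal piece is shown from its own random view.
        angle = rng.uniform(-max_angle, max_angle)
        pieces.append(segments[idx] @ rotation_y(angle).T)
    return np.concatenate(pieces, axis=0), label


def view_pool(features):
    """Pool per-view features of shape (V, D) into one (D,) vector.

    Assumption: element-wise max pooling; the abstract only says "pooling".
    """
    return features.max(axis=0)
```

In this sketch, raising `n_segments` or `max_angle` makes the pretext task harder, which is one simple way to vary task difficulty in the spirit of the study the abstract mentions.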
Pages: 36385 - 36396
Number of pages: 12