Idempotent Unsupervised Representation Learning for Skeleton-Based Action Recognition

Cited by: 0
Authors
Lin, Lilang [1 ]
Wu, Lehong [1 ]
Zhang, Jiahang [1 ]
Wang, Jiaying [1 ]
Affiliations
[1] Peking Univ, Wangxuan Inst Comp Technol, Beijing, Peoples R China
Source
COMPUTER VISION - ECCV 2024, PT XXVI | 2025 / Vol. 15084
Funding
National Natural Science Foundation of China;
Keywords
Self-supervised learning; skeleton-based action recognition; contrastive learning;
DOI
10.1007/978-3-031-73347-5_5
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Generative models, a powerful technique for generation, are gradually becoming a critical tool for recognition tasks as well. However, in skeleton-based action recognition, the features obtained from existing pre-trained generative methods contain redundant information unrelated to recognition. This is at odds with the spatially sparse and temporally consistent nature of skeleton data and leads to undesirable performance. To address this challenge, we bridge the gap in theory and methodology and propose a novel skeleton-based idempotent generative model (IGM) for unsupervised representation learning. More specifically, we first theoretically demonstrate the equivalence between generative models and maximum entropy coding, which suggests a potential route to making the features of generative models more compact by introducing contrastive learning. To this end, we introduce an idempotency constraint that forms a stronger consistency regularization in the feature space, pushing the features to retain only the critical information of motion semantics needed for the recognition task. Extensive experiments on the benchmark datasets NTU RGB+D and PKU-MMD demonstrate the effectiveness of the proposed method. On the NTU 60 xsub dataset, we observe a performance improvement from 84.6% to 86.2%. Furthermore, in zero-shot adaptation scenarios, our model achieves promising results in cases that were previously unrecognizable. Our project is available at https://github.com/LanglandsLin/IGM.
Pages: 75 - 92
Page count: 18
Related papers
50 in total
  • [21] Global-local contrastive multiview representation learning for skeleton-based action
    Bian, Cunling
    Feng, Wei
    Meng, Fanbo
    Wang, Song
    COMPUTER VISION AND IMAGE UNDERSTANDING, 2023, 229
  • [22] X-Invariant Contrastive Augmentation and Representation Learning for Semi-Supervised Skeleton-Based Action Recognition
    Xu, Binqian
    Shu, Xiangbo
    Song, Yan
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2022, 31 : 3852 - 3867
  • [23] Cross-Scale Spatiotemporal Refinement Learning for Skeleton-Based Action Recognition
    Zhang, Yu
    Sun, Zhonghua
    Dai, Meng
    Feng, Jinchao
    Jia, Kebin
    IEEE SIGNAL PROCESSING LETTERS, 2024, 31 : 441 - 445
  • [24] Temporal-masked skeleton-based action recognition with supervised contrastive learning
    Zhao, Zhifeng
    Chen, Guodong
    Lin, Yuxiang
    SIGNAL IMAGE AND VIDEO PROCESSING, 2023, 17 (05) : 2267 - 2275
  • [25] Feature difference and feature correlation learning mechanism for skeleton-based action recognition
    Qing, Ruxin
    Jiang, Min
    Kong, Jun
    JOURNAL OF ELECTRONIC IMAGING, 2023, 32 (01)
  • [26] Learning from Temporal Spatial Cubism for Cross-Dataset Skeleton-based Action Recognition
    Tang, Yansong
    Liu, Xingyu
    Yu, Xumin
    Zhang, Danyang
    Lu, Jiwen
    Zhou, Jie
    ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2022, 18 (02)
  • [27] CdCLR: Clip-Driven Contrastive Learning for Skeleton-Based Action Recognition
    Gao, Rong
    Liu, Xin
    Yang, Jingyu
    Yue, Huanjing
    2022 IEEE INTERNATIONAL CONFERENCE ON VISUAL COMMUNICATIONS AND IMAGE PROCESSING (VCIP), 2022,
  • [28] Learning Heterogeneous Spatial-Temporal Context for Skeleton-Based Action Recognition
    Gao, Xuehao
    Yang, Yang
    Wu, Yang
    Du, Shaoyi
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (09) : 12130 - 12141
  • [29] Temporal-masked skeleton-based action recognition with supervised contrastive learning
    Zhifeng Zhao
    Guodong Chen
    Yuxiang Lin
    Signal, Image and Video Processing, 2023, 17 : 2267 - 2275
  • [30] Adaptive multi-level graph convolution with contrastive learning for skeleton-based action recognition
    Geng, Pei
    Li, Haowei
    Wang, Fuyun
    Lyu, Lei
    SIGNAL PROCESSING, 2022, 201