Contrastive Learning with Cross-Part Bidirectional Distillation for Self-supervised Skeleton-Based Action Recognition

Cited by: 0
|
Authors
Yang, Huaigang [1 ]
Zhang, Qieshi [2 ,3 ]
Ren, Ziliang [1 ,2 ]
Yuan, Huaqiang [1 ]
Zhang, Fuyong [1 ]
Affiliations
[1] Dongguan Univ Technol, Sch Comp Sci & Technol, Dongguan, Peoples R China
[2] Chinese Acad Sci, Shenzhen Inst Adv Technol, CAS Key Lab Human Machine Intelligence Synergy Sys, Shenzhen, Peoples R China
[3] Chinese Univ Hong Kong, Hong Kong, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Skeleton-based Action Recognition; Contrastive Learning; Self-attention; Knowledge Distillation; Skeleton; Segmentation; NETWORKS; LSTM;
DOI
10.22967/HCIS.2024.14.070
Chinese Library Classification
TP [Automation & Computer Technology];
Subject Classification Code
0812 ;
Abstract
Since self-supervised learning does not require large amounts of labelled data, several methods have applied self-supervised contrastive learning to 3D skeleton-based action recognition. Although the skeleton sequence is a highly correlated data modality, current work considers only the global skeleton sequence, which is augmented into different views and fed into the contrastive encoding network; it neglects the local semantic information of the skeleton, so existing methods find certain fine-grained and ambiguous action classes harder to distinguish. We therefore propose a self-supervised contrastive learning method with bidirectional knowledge distillation across part streams for skeleton-based action recognition. On the one hand, unlike traditional methods, we perform a pose-based factorization of skeleton sequences to form two part streams and apply single-part-stream contrastive learning to encode action features for each stream. On the other hand, we design a contrastive learning framework based on relational knowledge distillation, named cross-part bidirectional distillation (CPBD), to train the upstream self-supervised model more effectively and improve downstream action recognition accuracy. The proposed framework is evaluated on three datasets, NTU RGB+D 60, NTU RGB+D 120, and PKU-MMD, achieving state-of-the-art performance, including 92.0% accuracy on PKU-MMD Part I under the linear evaluation protocol. Furthermore, the architecture can distinguish more challenging ambiguous action samples, such as "touch head" and "touch neck."
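The abstract describes the method only at a high level: per-stream contrastive learning plus a bidirectional relational distillation between the two part streams. As an illustrative sketch only (the function names, the InfoNCE form of the contrastive loss, and the symmetric-KL choice for the relational term are assumptions here, not the paper's actual formulation), the two ingredients might look like this in NumPy:

```python
import numpy as np

def info_nce(q, k, temperature=0.07):
    """InfoNCE contrastive loss for one part stream: each query's positive
    key sits at the same index; all other keys act as negatives."""
    q = q / np.linalg.norm(q, axis=1, keepdims=True)
    k = k / np.linalg.norm(k, axis=1, keepdims=True)
    logits = q @ k.T / temperature                    # (N, N) similarities
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))                # positives on diagonal

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def bidirectional_distill(sim_a, sim_b, temperature=0.1):
    """Symmetric KL divergence between the two streams' pairwise-similarity
    distributions: each stream teaches the other (relational distillation)."""
    p = softmax(sim_a / temperature)
    q = softmax(sim_b / temperature)
    kl_ab = np.sum(p * (np.log(p) - np.log(q)), axis=1).mean()
    kl_ba = np.sum(q * (np.log(q) - np.log(p)), axis=1).mean()
    return kl_ab + kl_ba
```

In this sketch, a total loss would sum the per-stream InfoNCE terms and the bidirectional distillation term, so each part stream is trained both against its own augmented views and against the relational structure learned by the other stream.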
Pages: 21
Related Papers
50 records
  • [41] Idempotent Unsupervised Representation Learning for Skeleton-Based Action Recognition
    Lin, Lilang
    Wu, Lehong
    Zhang, Jiahang
    Wang, Jiaying
    COMPUTER VISION - ECCV 2024, PT XXVI, 2025, 15084 : 75 - 92
  • [42] Cross-Scale Spatiotemporal Refinement Learning for Skeleton-Based Action Recognition
    Zhang, Yu
    Sun, Zhonghua
    Dai, Meng
    Feng, Jinchao
    Jia, Kebin
    IEEE SIGNAL PROCESSING LETTERS, 2024, 31 : 441 - 445
  • [43] Skeleton-based Action Recognition via Adaptive Cross-Form Learning
    Wang, Xuanhan
    Dai, Yan
    Gao, Lianli
    Song, Jingkuan
    PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022, : 1670 - 1678
  • [44] JointContrast: Skeleton-Based Interaction Recognition with New Representation and Contrastive Learning
    Zhang, Ji
    Jia, Xiangze
    Wang, Zhen
    Luo, Yonglong
    Chen, Fulong
    Yang, Gaoming
    Zhao, Lihui
    ALGORITHMS, 2023, 16 (04)
  • [45] Self-supervised group meiosis contrastive learning for EEG-based emotion recognition
    Haoning Kan
    Jiale Yu
    Jiajin Huang
    Zihe Liu
    Heqian Wang
    Haiyan Zhou
    Applied Intelligence, 2023, 53 : 27207 - 27225
  • [46] Contrastive Self-Supervised Learning for Sensor-Based Human Activity Recognition: A Review
    Chen, Hui
    Gouin-Vallerand, Charles
    Bouchard, Kevin
    Gaboury, Sebastien
    Couture, Melanie
    Bier, Nathalie
    Giroux, Sylvain
    IEEE ACCESS, 2024, 12 : 152511 - 152531
  • [48] Global-local contrastive multiview representation learning for skeleton-based action
    Bian, Cunling
    Feng, Wei
    Meng, Fanbo
    Wang, Song
    COMPUTER VISION AND IMAGE UNDERSTANDING, 2023, 229
  • [49] A Short Survey on Deep Learning for Skeleton-based Action Recognition
    Wang, Wei
    Zhang, Yu-Dong
    COMPANION PROCEEDINGS OF THE 14TH IEEE/ACM INTERNATIONAL CONFERENCE ON UTILITY AND CLOUD COMPUTING (UCC'21 COMPANION), 2021,
  • [50] Representation Learning of Temporal Dynamics for Skeleton-Based Action Recognition
    Du, Yong
    Fu, Yun
    Wang, Liang
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2016, 25 (07) : 3010 - 3022