Contrastive Learning with Cross-Part Bidirectional Distillation for Self-supervised Skeleton-Based Action Recognition

Cited by: 0
Authors
Yang, Huaigang [1 ]
Zhang, Qieshi [2 ,3 ]
Ren, Ziliang [1 ,2 ]
Yuan, Huaqiang [1 ]
Zhang, Fuyong [1 ]
Affiliations
[1] Dongguan Univ Technol, Sch Comp Sci & Technol, Dongguan, Peoples R China
[2] Chinese Acad Sci, Shenzhen Inst Adv Technol, CAS Key Lab Human Machine Intelligence Synergy Sys, Shenzhen, Peoples R China
[3] Chinese Univ Hong Kong, Hong Kong, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Skeleton-based Action Recognition; Contrastive Learning; Self-attention; Knowledge Distillation; Skeleton; Segmentation; NETWORKS; LSTM;
DOI
10.22967/HCIS.2024.14.070
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline Code
0812
Abstract
Since self-supervised learning does not require large amounts of labelled data, several methods have applied self-supervised contrastive learning to 3D skeleton-based action recognition. However, although a skeleton sequence is a highly correlated data modality, existing work considers only the global skeleton sequence: different views are formed by data augmentation and fed into the contrastive encoding network, while the local semantic information of the skeleton is ignored. As a result, existing methods find it difficult to distinguish certain fine-grained and ambiguous action classes. We therefore propose a self-supervised contrastive learning method with bidirectional knowledge distillation across part streams for skeleton-based action recognition. On the one hand, unlike traditional methods, we perform a pose-based factorization of skeleton sequences into two part streams and apply single-part-stream contrastive learning to encode action features for each stream. On the other hand, we design a contrastive learning framework based on relational knowledge distillation, named cross-part bidirectional distillation (CPBD), to train the upstream self-supervised model more effectively and to improve downstream action recognition accuracy. The proposed framework is evaluated on three datasets, NTU RGB+D 60, NTU RGB+D 120, and PKU-MMD, where it achieves state-of-the-art performance, reaching 92.0% accuracy on PKU-MMD Part I under the linear evaluation protocol. Furthermore, the recognition architecture can distinguish more challenging ambiguous action samples, such as "touch head" and "touch neck".
Pages: 21