Contrastive Learning with Cross-Part Bidirectional Distillation for Self-supervised Skeleton-Based Action Recognition

Cited by: 0
Authors
Yang, Huaigang [1 ]
Zhang, Qieshi [2 ,3 ]
Ren, Ziliang [1 ,2 ]
Yuan, Huaqiang [1 ]
Zhang, Fuyong [1 ]
Affiliations
[1] Dongguan Univ Technol, Sch Comp Sci & Technol, Dongguan, Peoples R China
[2] Chinese Acad Sci, Shenzhen Inst Adv Technol, CAS Key Lab Human Machine Intelligence Synergy Sys, Shenzhen, Peoples R China
[3] Chinese Univ Hong Kong, Hong Kong, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Skeleton-based Action Recognition; Contrastive Learning; Self-attention; Knowledge Distillation; Skeleton; Segmentation; NETWORKS; LSTM;
DOI
10.22967/HCIS.2024.14.070
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline Code
0812
Abstract
Since self-supervised learning does not require large amounts of labelled data, several methods have applied self-supervised contrastive learning to 3D skeleton-based action recognition. However, although a skeleton sequence is a highly correlated data modality, existing work considers only the global skeleton sequence: different views are formed by data augmentation and fed into the contrastive encoding network, while the local semantic information of the skeleton is ignored. As a result, existing methods find it difficult to distinguish certain fine-grained and ambiguous action classes. We therefore propose a self-supervised contrastive learning method with bidirectional knowledge distillation across part streams for skeleton-based action recognition. On the one hand, unlike traditional methods, we perform a pose-based factorization of skeleton sequences into two part streams and apply single-part-stream contrastive learning to encode action features for each stream. On the other hand, we design a contrastive learning framework based on relational knowledge distillation, named cross-part bidirectional distillation (CPBD), to train the upstream self-supervised model more effectively and to improve downstream action recognition accuracy. The proposed framework is evaluated on three datasets, NTU RGB+D 60, NTU RGB+D 120, and PKU-MMD, where it achieves state-of-the-art performance, reaching 92.0% accuracy on PKU-MMD Part I under the linear evaluation protocol. Furthermore, the recognition architecture can distinguish more challenging ambiguous action samples, such as "touch head" and "touch neck".
Pages: 21