Contrastive Learning with Cross-Part Bidirectional Distillation for Self-supervised Skeleton-Based Action Recognition

Cited by: 0

Authors:

Yang, Huaigang [1]

Zhang, Qieshi [2,3]

Ren, Ziliang [1,2]

Yuan, Huaqiang [1]

Zhang, Fuyong [1]

Affiliations:

[1] Dongguan Univ Technol, Sch Comp Sci & Technol, Dongguan, Peoples R China

[2] Chinese Acad Sci, Shenzhen Inst Adv Technol, CAS Key Lab Human Machine Intelligence Synergy Sys, Shenzhen, Peoples R China

[3] Chinese Univ Hong Kong, Hong Kong, Peoples R China

Source:

HUMAN-CENTRIC COMPUTING AND INFORMATION SCIENCES | 2024 / Vol. 14

Funding:

National Natural Science Foundation of China;

Keywords:

Skeleton-based Action Recognition; Contrastive Learning; Self-attention; Knowledge Distillation; Skeleton; Segmentation; NETWORKS; LSTM;

DOI:

10.22967/HCIS.2024.14.070

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

Because self-supervised learning does not require large amounts of labelled data, several methods have applied self-supervised contrastive learning to 3D skeleton-based action recognition. Although the skeleton sequence is a highly correlated data modality, existing work considers only the global skeleton sequence: different views are formed by data augmentation and fed to the contrastive encoding network. Such methods do not attend to the local semantic information of the skeleton, which makes certain fine-grained and ambiguous action classes difficult to distinguish. We therefore propose a self-supervised contrastive learning method with bidirectional knowledge distillation across part streams for skeleton-based action recognition. First, unlike traditional methods, we perform a pose-based factorization of skeleton sequences to form two partial streams and apply a single-partial-stream contrastive learning method to encode action features for each of the two streams. Second, we design a contrastive learning framework based on relational knowledge distillation, named cross-part bidirectional distillation (CPBD), to train the upstream self-supervised model in a more reasonable way and to improve downstream action recognition accuracy. The proposed framework is evaluated on three datasets, NTU RGB+D 60, NTU RGB+D 120, and PKU-MMD, where it achieves state-of-the-art performance, including 92.0% accuracy on PKU-MMD Part I under the linear evaluation protocol. Furthermore, the architecture can distinguish more challenging ambiguous actions, such as touch head and touch neck.
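Illustrative sketch (not taken from the paper): the abstract describes relational knowledge distillation applied in both directions between two part streams but does not give the loss. One common way to realize that idea is to turn each stream's similarities to a shared set of anchor samples into a probability distribution and have each stream match the other's distribution. Everything specific below is an assumption made for illustration only: the function names, the use of cosine similarity, the KL divergence, the two temperatures, and the anchor sets.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length along `axis`."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def softmax(logits, axis=-1):
    """Numerically stable softmax."""
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def similarity_distribution(embeddings, anchors, temperature):
    """Softmax over cosine similarities between each embedding and the anchors.

    embeddings: (batch, dim), anchors: (num_anchors, dim).
    Returns an array of shape (batch, num_anchors) whose rows sum to 1.
    """
    sims = l2_normalize(embeddings) @ l2_normalize(anchors).T
    return softmax(sims / temperature, axis=1)

def kl_divergence(p, q, eps=1e-12):
    """Mean over the batch of KL(p || q) for row-wise distributions."""
    return float(np.mean(np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=1)))

def bidirectional_distillation_loss(z_a, z_b, anchors_a, anchors_b,
                                    t_teacher=0.05, t_student=0.1):
    """Hypothetical bidirectional relational distillation between two part streams.

    z_a, z_b: embeddings of the same samples from part stream A and B.
    anchors_a, anchors_b: embeddings of the same anchor samples in each stream.
    Each stream acts as teacher (sharper temperature) for the other stream.
    """
    p_a_teacher = similarity_distribution(z_a, anchors_a, t_teacher)
    p_b_teacher = similarity_distribution(z_b, anchors_b, t_teacher)
    p_a_student = similarity_distribution(z_a, anchors_a, t_student)
    p_b_student = similarity_distribution(z_b, anchors_b, t_student)
    # A -> B and B -> A distillation terms.
    return kl_divergence(p_a_teacher, p_b_student) + kl_divergence(p_b_teacher, p_a_student)

# Usage with random stand-in embeddings (no real skeleton data involved).
rng = np.random.default_rng(0)
z_a, z_b = rng.normal(size=(4, 8)), rng.normal(size=(4, 8))
anchors_a, anchors_b = rng.normal(size=(16, 8)), rng.normal(size=(16, 8))
loss = bidirectional_distillation_loss(z_a, z_b, anchors_a, anchors_b)
```

In a training setup the teacher distributions would normally be treated as constants (no gradient), and this term would be added to each stream's own contrastive loss; the paper should be consulted for the actual formulation.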


Pages: 21

50 items in total

[21] A Bidirectional Separated Distillation-Based Cross-Modal Interactive Fusion Network for Skeleton-Based Action Recognition
Wang, Mingdao
Zhang, Xianlin
Chen, Siqi
Li, Xueming
Zhang, Yue
IEEE SENSORS JOURNAL, 2025, 25 (01) : 1814 - 1824
[22] DIDA: Dynamic Individual-to-integrateD Augmentation for Self-supervised Skeleton-Based Action Recognition
Hu, Haobo
Li, Jianan
Fan, Hongbin
Zhao, Zhifu
Zhou, Yangtao
PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2024, PT VII, 2025, 15037 : 496 - 510
[23] DMMG: Dual Min-Max Games for Self-Supervised Skeleton-Based Action Recognition
Guan, Shannan
Yu, Xin
Huang, Wei
Fang, Gengfa
Lu, Haiyan
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2024, 33 : 395 - 407
[24] Multi-Granularity Anchor-Contrastive Representation Learning for Semi-Supervised Skeleton-Based Action Recognition
Shu, Xiangbo
Xu, Binqian
Zhang, Liyan
Tang, Jinhui
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (06) : 7559 - 7576
[25] SELF-SUPERVISED CONTRASTIVE LEARNING FOR AUDIO-VISUAL ACTION RECOGNITION
Liu, Yang
Tan, Ying
Lan, Haoyuan
2023 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2023, : 1000 - 1004
[26] Spatiotemporal Decouple-and-Squeeze Contrastive Learning for Semisupervised Skeleton-Based Action Recognition
Xu, Binqian
Shu, Xiangbo
Zhang, Jiachao
Dai, Guangzhao
Song, Yan
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (08) : 11035 - 11048
[27] Efficient Spatio-Temporal Contrastive Learning for Skeleton-Based 3-D Action Recognition
Gao, Xuehao
Yang, Yang
Zhang, Yimeng
Li, Maosen
Yu, Jin-Gang
Du, Shaoyi
IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25 : 405 - 417
[28] Structural Knowledge Distillation for Efficient Skeleton-Based Action Recognition
Bian, Cunling
Feng, Wei
Wan, Liang
Wang, Song
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30 : 2963 - 2976
[29] Hierarchical Consistent Contrastive Learning for Skeleton-Based Action Recognition with Growing Augmentations
Zhang, Jiahang
Lin, Lilang
Liu, Jiaying
THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 3, 2023, : 3427 - 3435
[30] Cross-Model Cross-Stream Learning for Self-Supervised Human Action Recognition
Liu, Mengyuan
Liu, Hong
Guo, Tianyu
IEEE TRANSACTIONS ON HUMAN-MACHINE SYSTEMS, 2024, 54 (06) : 743 - 752
