X-Invariant Contrastive Augmentation and Representation Learning for Semi-Supervised Skeleton-Based Action Recognition

Cited by: 80
Authors
Xu, Binqian [1]
Shu, Xiangbo [1]
Song, Yan [1]
Affiliations
[1] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Nanjing 210094, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Skeleton; Representation learning; Joints; Bones; Semisupervised learning; Recurrent neural networks; Hidden Markov models; Action recognition; skeleton; semi-supervised; contrastive learning;
DOI
10.1109/TIP.2022.3175605
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Semi-supervised skeleton-based action recognition is challenging because labeled data are scarce. To address this problem, some representative methods leverage contrastive learning to obtain more features from pre-augmented skeleton actions. Such methods usually take a two-stage approach: they first randomly augment the samples and then learn their representations via contrastive learning. Because the skeleton samples have already been randomly augmented, the representation ability of the subsequent contrastive learning is limited by the inconsistency between the augmentations and the representations. We therefore propose a novel X-invariant Contrastive Augmentation and Representation learning (X-CAR) framework that thoroughly obtains rotate-shear-scale (X for short) invariant features by learning the augmentations and representations of skeleton sequences in a single stage. In X-CAR, a new Adaptive-combination Augmentation (AA) mechanism rotates, shears, and scales the skeletons using learnable controlling factors, adaptively rather than randomly. These controlling factors are learned throughout the contrastive learning process, which promotes consistency between the learned augmentations and representations of skeleton sequences. In addition, we relax the pre-definition of positive and negative samples to avoid the confusing allocation of ambiguous samples, and present a new Pull-Push Contrastive Loss (PPCL) that pulls the augmented skeleton close to the original skeleton while pushing it away from the other skeletons. Experimental results on the NTU RGB+D and North-Western UCLA datasets show that the proposed X-CAR achieves better accuracy than other competitive methods in the semi-supervised scenario.
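The rotate-shear-scale augmentation and the pull-push idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the choice of a z-axis rotation and an x-by-y shear, and the cosine-based loss are all assumptions made for illustration. In an actual training setup, `theta`, `k`, and `s` would be learnable parameters updated by gradient descent together with the encoder; here they are plain floats.

```python
import math

def matmul3(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][n] * b[n][j] for n in range(3)) for j in range(3)]
            for i in range(3)]

def rotation_z(theta):
    """Rotation about the z axis by theta radians (axis choice is an assumption)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def shear_xy(k):
    """Shear that maps x to x + k * y (one of several possible shears)."""
    return [[1.0, k, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

def scale(s):
    """Uniform scaling by factor s."""
    return [[s, 0.0, 0.0], [0.0, s, 0.0], [0.0, 0.0, s]]

def augment(joints, theta, k, s):
    """Apply scale, then shear, then rotation to each (x, y, z) joint.

    theta, k, s stand in for the learnable controlling factors.
    """
    m = matmul3(rotation_z(theta), matmul3(shear_xy(k), scale(s)))
    return [tuple(sum(m[i][j] * p[j] for j in range(3)) for i in range(3))
            for p in joints]

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def pull_push_loss(aug, orig, others):
    """Illustrative pull-push objective (hypothetical, not the paper's PPCL).

    The pull term shrinks as the augmented embedding aligns with its original;
    the push term penalizes positive similarity to the other embeddings.
    """
    pull = 1.0 - cosine(aug, orig)
    push = sum(max(0.0, cosine(aug, o)) for o in others) / len(others)
    return pull + push
```

For example, `augment([(1.0, 0.0, 0.0)], math.pi / 2, 0.0, 2.0)` doubles the joint's length and rotates it onto the y axis, up to floating-point error.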
Pages: 3852-3867
Page count: 16