X-Invariant Contrastive Augmentation and Representation Learning for Semi-Supervised Skeleton-Based Action Recognition

Cited by: 80

Authors:

Xu, Binqian [1]

Shu, Xiangbo [1]

Song, Yan [1]

Affiliations:

[1] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Nanjing 210094, Peoples R China

Source:

IEEE TRANSACTIONS ON IMAGE PROCESSING | 2022 / Vol. 31

Funding:

National Natural Science Foundation of China;

Keywords:

Skeleton; Representation learning; Joints; Bones; Semisupervised learning; Recurrent neural networks; Hidden Markov models; Action recognition; skeleton; semi-supervised; contrastive learning;

DOI:

10.1109/TIP.2022.3175605

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Semi-supervised skeleton-based action recognition is a challenging problem due to insufficient labeled data. To address this problem, some representative methods leverage contrastive learning to obtain more features from pre-augmented skeleton actions. Such methods usually adopt a two-stage approach: first randomly augment the samples, and then learn their representations via contrastive learning. Because the skeleton samples have already been randomly augmented, the representation ability of the subsequent contrastive learning is limited by the inconsistency between the augmentations and the representations. We therefore propose a novel X-invariant Contrastive Augmentation and Representation learning (X-CAR) framework that obtains rotate-shear-scale (X for short) invariant features by learning the augmentations and representations of skeleton sequences in a single stage. In X-CAR, a new Adaptive-combination Augmentation (AA) mechanism rotates, shears, and scales the skeletons through learnable controlling factors, adaptively rather than randomly. These controlling factors are learned throughout the contrastive learning process, which facilitates consistency between the learned augmentations and representations of skeleton sequences. In addition, we relax the pre-definition of positive and negative samples to avoid the confusing allocation of ambiguous samples, and present a new Pull-Push Contrastive Loss (PPCL) that pulls the augmented skeleton close to the original skeleton while pushing it away from the other skeletons. Experimental results on both the NTU RGB+D and North-Western UCLA datasets show that the proposed X-CAR achieves better accuracy than other competitive methods in the semi-supervised scenario.

Pages: 3852-3867

Page count: 16
