Representation modeling learning with multi-domain decoupling for unsupervised skeleton-based action recognition

Times cited: 0

Authors:

He, Zhiquan ^{[1,2]}

Lv, Jiantu ^{[2]}

Fang, Shizhang ^{[2]}

Affiliations:

[1] Guangdong Key Lab Intelligent Informat Proc, Shenzhen, Peoples R China

[2] Shenzhen Univ, Guangdong Multimedia Informat Serv Engn Technol Re, Shenzhen, Peoples R China

Source:

NEUROCOMPUTING | 2024 / Vol. 582

Funding:

National Natural Science Foundation of China;

Keywords:

Unsupervised learning; Contrastive learning; Action recognition;

DOI:

10.1016/j.neucom.2024.127495

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Skeleton-based action recognition is a fundamental research topic in computer vision. In recent years, the unsupervised contrastive learning paradigm has achieved great success in this area. However, previous work has often treated input skeleton sequences as a whole when performing contrast, and so lacks fine-grained contrastive representation learning. We therefore propose a contrastive learning method for Representation Modeling with Multi-domain Decoupling (RMMD), which extracts the most significant representations from input skeleton sequences in the temporal, spatial and frequency domains, respectively. Specifically, in the temporal and spatial domains, we propose a multi-level spatiotemporal mining reconstruction module (STMR) that iteratively reconstructs the original input skeleton sequences to highlight spatiotemporal representations under different actions. We also introduce position encoding and a global adaptive attention matrix, which balance global and local information and effectively model the spatiotemporal dependencies between joints. In the frequency domain, we use the discrete cosine transform (DCT) to convert from the temporal to the frequency domain, discard part of the interference information, and use frequency self-attention (FSA) and a multi-level aggregation perceptron (MLAP) to explore the frequency-domain representation in depth. Fusing the temporal, spatial and frequency domain representations makes our model more discriminative in representing different actions. We verify the effectiveness of the model on the NTU RGB+D and PKU-MMD datasets. Extensive experiments show that our method outperforms existing unsupervised methods and achieves significant performance improvements in downstream tasks such as action recognition and action retrieval.
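
Illustrative note (not part of the original record): the frequency-domain step described in the abstract, a DCT along the time axis followed by discarding part of the coefficients, can be sketched roughly as below. This is a minimal sketch under stated assumptions, not the authors' implementation. The function names `dct_matrix` and `temporal_dct_lowpass`, the (frames, joints, channels) array layout, and the choice to keep only the lowest `keep` frequencies are all hypothetical; the abstract does not say which coefficients RMMD discards.

```python
import numpy as np


def dct_matrix(n: int) -> np.ndarray:
    """Return the orthonormal DCT-II basis as an (n, n) matrix.

    Row k holds the k-th cosine basis function sampled at n time steps.
    """
    k = np.arange(n)[:, None]  # frequency index (rows)
    t = np.arange(n)[None, :]  # time index (columns)
    basis = np.cos(np.pi * (2 * t + 1) * k / (2 * n))
    basis[0] *= 1.0 / np.sqrt(2.0)  # rescale the DC row for orthonormality
    return basis * np.sqrt(2.0 / n)


def temporal_dct_lowpass(seq: np.ndarray, keep: int) -> np.ndarray:
    """Apply a DCT over the time axis and keep the lowest `keep` frequencies.

    seq: skeleton sequence of shape (T, J, C), assumed here to be
         frames x joints x coordinate channels (hypothetical layout).
    Returns an array of shape (keep, J, C) of low-frequency coefficients.
    """
    num_frames = seq.shape[0]
    # Contract the time axis of `seq` with the columns of the DCT basis.
    coeffs = np.tensordot(dct_matrix(num_frames), seq, axes=(1, 0))
    # Assumption: "interference information" is treated as high frequencies.
    return coeffs[:keep]
```

For example, calling `temporal_dct_lowpass(seq, keep=8)` on a sequence of 64 frames, 25 joints and 3 coordinates returns an array of shape `(8, 25, 3)`. Because the basis is orthonormal, multiplying zero-padded coefficients by the transpose of `dct_matrix(64)` gives a smoothed reconstruction of the motion.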


Pages: 11

50 items in total

[1] Idempotent Unsupervised Representation Learning for Skeleton-Based Action Recognition
Lin, Lilang
Wu, Lehong
Zhang, Jiahang
Wang, Jiaying
COMPUTER VISION - ECCV 2024, PT XXVI, 2025, 15084 : 75 - 92
[2] Progressive semantic learning for unsupervised skeleton-based action recognition
Qin, Hao
Chen, Luyuan
Kong, Ming
Zhao, Zhuoran
Zeng, Xianzhou
Lu, Mengxu
Zhu, Qiang
MACHINE LEARNING, 2025, 114 (03)
[3] EnsCLR: Unsupervised skeleton-based action recognition via ensemble contrastive learning of representation
Wang, Kun
Cao, Jiuxin
Cao, Biwei
Liu, Bo
COMPUTER VISION AND IMAGE UNDERSTANDING, 2024, 247
[4] Unsupervised skeleton-based action representation learning via relation consistency pursuit
Wenjing Zhang
Yonghong Hou
Haoyuan Zhang
Neural Computing and Applications, 2022, 34 : 20327 - 20339
[5] Unsupervised skeleton-based action representation learning via relation consistency pursuit
Zhang, Wenjing
Hou, Yonghong
Zhang, Haoyuan
NEURAL COMPUTING & APPLICATIONS, 2022, 34 (22) : 20327 - 20339
[6] Representation Learning of Temporal Dynamics for Skeleton-Based Action Recognition
Du, Yong
Fu, Yun
Wang, Liang
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2016, 25 (07) : 3010 - 3022
[7] Reconstruction-driven contrastive learning for unsupervised skeleton-based human action recognition
Liu, Xing
Gao, Bo
JOURNAL OF SUPERCOMPUTING, 2025, 81 (01)
[8] Balanced Representation Learning for Long-tailed Skeleton-based Action Recognition
Liu, Hongda
Wang, Yunlong
Ren, Min
Hu, Junxing
Luo, Zhengquan
Hou, Guangqi
Sun, Zhenan
MACHINE INTELLIGENCE RESEARCH, 2025
[9] Unsupervised Temporal Adaptation in Skeleton-Based Human Action Recognition
Tian, Haitao
Payeur, Pierre
ALGORITHMS, 2024, 17 (12)
[10] Robust Multi-Feature Learning for Skeleton-Based Action Recognition
Wang, Yingfu
Xu, Zheyuan
Li, Li
Yao, Jian
IEEE ACCESS, 2019, 7 : 148658 - 148671
