DMMG: Dual Min-Max Games for Self-Supervised Skeleton-Based Action Recognition

Cited by: 5
Authors
Guan, Shannan [1]
Yu, Xin [2]
Huang, Wei [3]
Fang, Gengfa [4]
Lu, Haiyan [1]
Affiliations
[1] Univ Technol Sydney, Australian Artificial Intelligence Inst, Ultimo, NSW 2007, Australia
[2] Univ Queensland, Sch Informat Technol & Elect Engn, Brisbane, Qld 4072, Australia
[3] RIKEN Ctr Adv Intelligence Project, Tokyo 1030027, Japan
[4] Univ Technol Sydney, Sch Elect & Data Engn, Ultimo, NSW 2007, Australia
Keywords
Self-supervised learning; adversarial learning; contrastive learning; skeleton action recognition; min-max game
DOI
10.1109/TIP.2023.3338410
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this work, we propose a new Dual Min-Max Games (DMMG)-based self-supervised skeleton action recognition method that augments unlabeled data within a contrastive learning framework. Our DMMG consists of a viewpoint variation min-max game and an edge perturbation min-max game. These two min-max games adopt an adversarial paradigm to perform data augmentation on the skeleton sequences and the graph-structured body joints, respectively. The viewpoint variation min-max game constructs hard contrastive pairs by generating skeleton sequences from diverse viewpoints. These hard contrastive pairs help our model learn representative action features, thus facilitating transfer to downstream tasks. Moreover, the edge perturbation min-max game builds diverse hard contrastive samples by perturbing the connectivity strength among graph-based body joints. These connectivity-varying contrastive pairs enable the model to capture the minimal sufficient information of different actions, such as the representative gestures of an action, while preventing the model from overfitting. By fully exploiting the proposed DMMG, we can generate sufficient challenging contrastive pairs and thus learn discriminative action feature representations from unlabeled skeleton data in a self-supervised manner. Extensive experiments demonstrate that our method achieves superior results under various evaluation protocols on the widely used NTU RGB+D, NTU RGB+D 120, and PKU-MMD datasets.
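The abstract describes two adversarial (min-max) augmentation games played against a contrastive objective: a viewpoint adversary that re-renders skeleton sequences from new viewpoints and an edge adversary that perturbs joint connectivity, both trained to make positive pairs harder while the encoder learns to match them. The sketch below is a minimal, hypothetical illustration of such a dual min-max step in PyTorch; all module names (ViewpointAugmentor, EdgePerturbator, ToyEncoder) and hyper-parameters are illustrative assumptions, not the authors' implementation.

# A minimal, hypothetical sketch of one dual min-max augmentation step for
# contrastive skeleton learning; names and values are illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F


def rotation_matrix(angles):
    """Batch of 3-D rotation matrices from per-sample Euler angles (B, 3)."""
    cx, cy, cz = torch.cos(angles).unbind(-1)
    sx, sy, sz = torch.sin(angles).unbind(-1)
    one, zero = torch.ones_like(cx), torch.zeros_like(cx)
    rx = torch.stack([one, zero, zero, zero, cx, -sx, zero, sx, cx], -1).view(-1, 3, 3)
    ry = torch.stack([cy, zero, sy, zero, one, zero, -sy, zero, cy], -1).view(-1, 3, 3)
    rz = torch.stack([cz, -sz, zero, sz, cz, zero, zero, zero, one], -1).view(-1, 3, 3)
    return rz @ ry @ rx


class ViewpointAugmentor(nn.Module):
    """Adversary that proposes a bounded viewpoint change (Euler angles)."""
    def __init__(self, max_angle=0.5):
        super().__init__()
        self.angles = nn.Parameter(torch.zeros(1, 3))   # learnable view offset
        self.max_angle = max_angle

    def forward(self, x):                               # x: (B, T, V, 3)
        ang = self.max_angle * torch.tanh(self.angles)  # keep rotations bounded
        rot = rotation_matrix(ang.expand(x.size(0), -1))
        return torch.einsum('btvc,bdc->btvd', x, rot)   # rotate every joint


class EdgePerturbator(nn.Module):
    """Adversary that perturbs connectivity strengths of the skeleton graph."""
    def __init__(self, adj):
        super().__init__()
        self.register_buffer('adj', adj)                # (V, V) base adjacency
        self.delta = nn.Parameter(torch.zeros_like(adj))

    def forward(self):
        return torch.clamp(self.adj + 0.1 * torch.tanh(self.delta), 0.0, 1.0)


def info_nce(z1, z2, tau=0.1):
    """InfoNCE contrastive loss between two batches of embeddings."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / tau
    labels = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, labels)


class ToyEncoder(nn.Module):
    """Stand-in encoder: one-hop graph mixing with the adjacency, then pooling."""
    def __init__(self, dim=128):
        super().__init__()
        self.proj = nn.Linear(3, dim)

    def forward(self, x, adj):                          # x: (B, T, V, 3)
        x = torch.einsum('uv,btvc->btuc', adj, x)       # propagate along graph edges
        return self.proj(x).mean(dim=(1, 2))            # pool over time and joints


# One dual min-max step on a random batch: the adversaries ascend the
# contrastive loss (harder positive pairs), then the encoder descends it.
B, T, V = 8, 16, 25
x = torch.randn(B, T, V, 3)                             # toy skeleton sequences
base_adj = torch.eye(V)                                 # toy base skeleton graph

enc, view_aug, edge_aug = ToyEncoder(), ViewpointAugmentor(), EdgePerturbator(base_adj)
opt_enc = torch.optim.Adam(enc.parameters(), lr=1e-3)
opt_adv = torch.optim.Adam(list(view_aug.parameters()) + list(edge_aug.parameters()), lr=1e-3)

# max step: update the augmentors to make the augmented view harder to match
loss_adv = -info_nce(enc(view_aug(x), edge_aug()), enc(x, base_adj))
opt_adv.zero_grad()
loss_adv.backward()
opt_adv.step()

# min step: update the encoder to stay invariant to the harder augmentation
loss_enc = info_nce(enc(view_aug(x).detach(), edge_aug().detach()), enc(x, base_adj))
opt_enc.zero_grad()
loss_enc.backward()
opt_enc.step()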
Pages: 395-407
Number of pages: 13
Related papers (50 in total)
[21] Lin, Lilang; Wu, Lehong; Zhang, Jiahang; Wang, Jiaying. Idempotent Unsupervised Representation Learning for Skeleton-Based Action Recognition [J]. COMPUTER VISION - ECCV 2024, PT XXVI, 2025, 15084: 75-92.
[22] Lin, Lilang; Song, Sijie; Yang, Wenhan; Liu, Jiaying. MS2L: Multi-Task Self-Supervised Learning for Skeleton Based Action Recognition [J]. MM '20: PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, 2020: 2490-2498.
[23] Zhou, Hualing; Li, Xi; Xu, Dahong; Liu, Hong; Guo, Jianping; Zhang, Yihan. Self-Supervised Action Representation Learning Based on Asymmetric Skeleton Data Augmentation [J]. SENSORS, 2022, 22 (22).
[24] Zhang, Jie; Wan, Zhifan; Hu, Lanqing; Lin, Stephen; Wu, Shuzhe; Shan, Shiguang. Collaboratively Self-Supervised Video Representation Learning for Action Recognition [J]. IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, 2025, 20: 1895-1907.
[25] Liu, Yi; Liu, Ruyi; Xin, Wentian; Miao, Qiguang; Hu, Yuzhi; Qi, Jiahao. Language-Skeleton Pre-training to Collaborate with Self-Supervised Human Action Recognition [J]. PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2024, PT VII, 2025, 15037: 409-423.
[26] Xin, Wentian; Miao, Qiguang; Liu, Yi; Liu, Ruyi; Pun, Chi-Man; Shi, Cheng. Skeleton MixFormer: Multivariate Topology Representation for Skeleton-based Action Recognition [J]. PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023: 2211-2220.
[27] Chen, Bowen; Ji, Haoyu; Ma, Hanwei; Lin, Ruihan; Nie, Wei; Ren, Weihong; Wang, Zhiyong; Liu, Honghai. Exploring Supervised Contrastive Learning for Skeleton-Based Temporal Action Segmentation [J]. IEEE TRANSACTIONS ON COGNITIVE AND DEVELOPMENTAL SYSTEMS, 2025, 17 (04): 964-975.
[28] Zhang, Jiaxu; Xie, Wei; Wang, Chao; Tu, Ruide; Tu, Zhigang. Graph-aware transformer for skeleton-based action recognition [J]. VISUAL COMPUTER, 2023, 39 (10): 4501-4512.
[29] Tian, Haitao; Payeur, Pierre. Unsupervised Temporal Adaptation in Skeleton-Based Human Action Recognition [J]. ALGORITHMS, 2024, 17 (12).