CdCLR: Clip-Driven Contrastive Learning for Skeleton-Based Action Recognition

Cited by: 1
Authors
Gao, Rong [1]
Liu, Xin [1,2]
Yang, Jingyu [1]
Yue, Huanjing [1]
Affiliations
[1] Tianjin Univ, Sch Elect & Informat Engn, Tianjin, Peoples R China
[2] Lappeenranta Lahti Univ Technol LUT, Comp Vision & Pattern Recognit Lab, Lappeenranta, Finland
Source
2022 IEEE INTERNATIONAL CONFERENCE ON VISUAL COMMUNICATIONS AND IMAGE PROCESSING (VCIP) | 2022
Funding
National Natural Science Foundation of China;
Keywords
Unsupervised skeleton-based action recognition; contrastive learning; sequence supervision; deep learning;
DOI
10.1109/VCIP56404.2022.10008837
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this study, we propose Clip-Driven Contrastive Learning for Skeleton-Based Action Recognition (CdCLR). Instead of treating whole sequences as instances, CdCLR extracts clips from the sequences as new instances. It aims to implement inherent supervision-guided contrastive learning through joint optimization of sequence discrimination, clip discrimination, and order verification. This mines abundant positive/negative pairs inside each sequence while learning inter- and intra-sequence semantic representations. Extensive experiments on the NTU RGB+D 60, UCLA and iMiGUE datasets show that CdCLR exhibits superior performance under various evaluation protocols and reaches state-of-the-art results. Our code is available at https://github.com/Erich-G/CdCLR/.
Pages: 5
Related papers
35 in total
[1]   Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset [J].
Carreira, Joao ;
Zisserman, Andrew .
30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, :4724-4733
[2]  
Chaolong L., 2018, 32 AAAI C ARTI FICIA
[3]   Temporal Hierarchical Dictionary Guided Decoding for Online Gesture Segmentation and Recognition [J].
Chen, Haoyu ;
Liu, Xin ;
Shi, Jingang ;
Zhao, Guoying .
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2020, 29 :9689-9702
[4]  
Chen T., 2020, Advances in neural information processing systems, P22243
[5]  
Chen T, 2020, PR MACH LEARN RES, V119
[6]  
Chen XL, 2020, Arxiv, DOI arXiv:2003.04297
[7]   Skeleton-Based Action Recognition with Shift Graph Convolutional Network [J].
Cheng, Ke ;
Zhang, Yifan ;
He, Xiangyu ;
Chen, Weihan ;
Cheng, Jian ;
Lu, Hanqing .
2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2020, :180-189
[8]   Learning Spatiotemporal Features with 3D Convolutional Networks [J].
Du Tran ;
Bourdev, Lubomir ;
Fergus, Rob ;
Torresani, Lorenzo ;
Paluri, Manohar .
2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015, :4489-4497
[9]  
Guo T., 2022, P AAAI C ARTIFICIAL
[10]   Momentum Contrast for Unsupervised Visual Representation Learning [J].
He, Kaiming ;
Fan, Haoqi ;
Wu, Yuxin ;
Xie, Saining ;
Girshick, Ross .
2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2020), 2020, :9726-9735