CdCLR: Clip-Driven Contrastive Learning for Skeleton-Based Action Recognition

Cited by: 2
Authors
Gao, Rong [1]
Liu, Xin [1, 2]
Yang, Jingyu [1]
Yue, Huanjing [1]
Affiliations
[1] Tianjin Univ, Sch Elect & Informat Engn, Tianjin, Peoples R China
[2] Lappeenranta Lahti Univ Technol LUT, Comp Vision & Pattern Recognit Lab, Lappeenranta, Finland
Source
2022 IEEE INTERNATIONAL CONFERENCE ON VISUAL COMMUNICATIONS AND IMAGE PROCESSING (VCIP) | 2022
Funding
National Natural Science Foundation of China;
Keywords
Unsupervised skeleton-based action recognition; contrastive learning; sequence supervision; deep learning;
DOI
10.1109/VCIP56404.2022.10008837
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this study, we propose Clip-Driven Contrastive Learning for Skeleton-Based Action Recognition (CdCLR). Instead of treating whole sequences as instances, CdCLR extracts clips from the sequences as new instances. It aims to implement inherent supervision-guided contrastive learning through joint training on sequence discrimination, clip discrimination, and order verification, mining abundant positive/negative pairs inside each sequence while learning inter- and intra-sequence semantic representations. Extensive experiments on the NTU RGB+D 60, UCLA, and iMiGUE datasets show that CdCLR achieves superior performance under various evaluation protocols and reaches state-of-the-art results. Our code is available at https://github.com/Erich-G/CdCLR/.
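The three training signals named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the equal-length clip split, the InfoNCE form of the contrastive loss, the temperature value, and the binary ordered/shuffled label are all assumptions made for illustration, and the encoder that would turn clips into embeddings is omitted.

```python
import numpy as np

def extract_clips(sequence, num_clips):
    """Split a (T, J, C) skeleton sequence into equal-length consecutive clips.

    Assumption: clips are non-overlapping; trailing frames that do not
    fill a whole clip are dropped.
    """
    clip_len = sequence.shape[0] // num_clips
    if clip_len == 0:
        raise ValueError("sequence is shorter than num_clips")
    return [sequence[i * clip_len:(i + 1) * clip_len] for i in range(num_clips)]

def info_nce(query, positive, negatives, temperature=0.07):
    """InfoNCE loss for one query, one positive, and an (N, D) negative matrix.

    Assumption: the same loss form would serve both sequence-level and
    clip-level discrimination, only the choice of pairs differs.
    """
    def normalize(x):
        return x / np.linalg.norm(x, axis=-1, keepdims=True)

    q, p, n = normalize(query), normalize(positive), normalize(negatives)
    logits = np.concatenate(([q @ p], n @ q)) / temperature
    logits = logits - logits.max()  # numerical stability
    return float(-logits[0] + np.log(np.exp(logits).sum()))

def order_verification_sample(clips, rng):
    """Return (clips, label): label 1 keeps temporal order, label 0 is shuffled."""
    if len(clips) < 2:
        raise ValueError("need at least two clips to verify order")
    if rng.random() < 0.5:
        return list(clips), 1
    identity = np.arange(len(clips))
    perm = rng.permutation(len(clips))
    while np.array_equal(perm, identity):  # make sure the order really changes
        perm = rng.permutation(len(clips))
    return [clips[i] for i in perm], 0
```

In this sketch, clips from the same sequence would supply positives for sequence-level discrimination, while different clips act as separate instances at the clip level; how the three objectives are weighted is not stated in this record.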
Pages: 5