Cross-Model Cross-Stream Learning for Self-Supervised Human Action Recognition

被引:2
作者
Liu, Mengyuan [1 ]
Liu, Hong [1 ]
Guo, Tianyu [1 ]
机构
[1] Peking Univ, Shenzhen Grad Sch, State Key Lab Gen Artificial Intelligence, Shenzhen 518055, Peoples R China
基金
中国国家自然科学基金;
关键词
Multistream; self-supervised learning; skeleton-based action recognition; NETWORKS; FUSION; LSTM;
D O I
10.1109/THMS.2024.3467334
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Considering the instance-level discriminative ability, contrastive learning methods, including MoCo and SimCLR, have been adapted from the original image representation learning task to solve the self-supervised skeleton-based action recognition task. These methods usually use multiple data streams (i.e., joint, motion, and bone) for ensemble learning, meanwhile, how to construct a discriminative feature space within a single stream and effectively aggregate the information from multiple streams remains an open problem. To this end, this article first applies a new contrastive learning method called bootstrap your own latent (BYOL) to learn from skeleton data, and then formulate SkeletonBYOL as a simple yet effective baseline for self-supervised skeleton-based action recognition. Inspired by SkeletonBYOL, this article further presents a cross-model and cross-stream (CMCS) framework. This framework combines cross-model adversarial learning (CMAL) and cross-stream collaborative learning (CSCL). Specifically, CMAL learns single-stream representation by cross-model adversarial loss to obtain more discriminative features. To aggregate and interact with multistream information, CSCL is designed by generating similarity pseudolabel of ensemble learning as supervision and guiding feature generation for individual streams. Extensive experiments on three datasets verify the complementary properties between CMAL and CSCL and also verify that the proposed method can achieve better results than state-of-the-art methods using various evaluation protocols.
引用
收藏
页码:743 / 752
页数:10
相关论文
共 65 条
  • [1] Fuzzy Integral-Based CNN Classifier Fusion for 3D Skeleton Action Recognition
    Banerjee, Avinandan
    Singh, Pawan Kumar
    Sarkar, Ram
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2021, 31 (06) : 2206 - 2216
  • [2] Intent Prediction in Human-Human Interactions
    Baruah, Murchana
    Banerjee, Bonny
    Nagar, Atulya K. K.
    [J]. IEEE TRANSACTIONS ON HUMAN-MACHINE SYSTEMS, 2023, 53 (02) : 458 - 463
  • [3] How and What to Learn: Taxonomizing Self-Supervised Learning for 3D Action Recognition
    Ben Tanfous, Amor
    Zerroug, Aimen
    Linsley, Drew
    Serre, Thomas
    [J]. 2022 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2022), 2022, : 2888 - 2897
  • [4] Skeleton-Based Action Recognition With Gated Convolutional Neural Networks
    Cao, Congqi
    Lan, Cuiling
    Zhang, Yifan
    Zeng, Wenjun
    Lu, Hanqing
    Zhang, Yanning
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2019, 29 (11) : 3247 - 3257
  • [5] OpenPose: Realtime Multi-Person 2D Pose Estimation Using Part Affinity Fields
    Cao, Zhe
    Hidalgo, Gines
    Simon, Tomas
    Wei, Shih-En
    Sheikh, Yaser
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2021, 43 (01) : 172 - 186
  • [6] Learning Multi-Granular Spatio-Temporal Graph Network for Skeleton-based Action Recognition
    Chen, Tailin
    Zhou, Desen
    Wang, Jian
    Wang, Shidong
    Guan, Yu
    He, Xuming
    Ding, Errui
    [J]. PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021, : 4334 - 4342
  • [7] Chen X., 2020, ARXIV
  • [8] Channel-wise Topology Refinement Graph Convolution for Skeleton-Based Action Recognition
    Chen, Yuxin
    Zhang, Ziqi
    Yuan, Chunfeng
    Li, Bing
    Deng, Ying
    Hu, Weiming
    [J]. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 13339 - 13348
  • [9] Cheng Y.-B., 2021, 2021 IEEE INT C MULT, P1, DOI DOI 10.1109/ICME51207.2021.9428459
  • [10] Du Y, 2015, PROCEEDINGS 3RD IAPR ASIAN CONFERENCE ON PATTERN RECOGNITION ACPR 2015, P579, DOI 10.1109/ACPR.2015.7486569