SMAM: Self and Mutual Adaptive Matching for Skeleton-Based Few-Shot Action Recognition

Cited by: 13
Authors
Li, Zhiheng [1]
Gong, Xuyuan [1]
Song, Ran [1]
Duan, Peng [2]
Liu, Jun [3]
Zhang, Wei [1]
Affiliations
[1] Shandong Univ, Sch Control Sci & Engn, Jinan, Peoples R China
[2] Liaocheng Univ, Sch Comp Sci, Liaocheng, Peoples R China
[3] Singapore Univ Technol & Design, Informat Syst Technol & Design Pillar, Singapore, Singapore
Funding
National Natural Science Foundation of China
Keywords
Skeleton-based; action recognition; few-shot learning;
DOI
10.1109/TIP.2022.3226410
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper focuses on skeleton-based few-shot action recognition. Since a skeleton is essentially a sparse representation of human action, the feature maps extracted from it by a standard encoder network in the few-shot setting may not be sufficiently discriminative for action sequences that look partially similar to each other. To address this issue, we propose a self and mutual adaptive matching (SMAM) module that converts such feature maps into more discriminative feature vectors. Our method, named SMAM-Net, first leverages both the temporal information associated with each individual skeleton joint and the spatial relationships among joints for feature extraction. Then, the SMAM module adaptively measures the similarity between labeled and query samples and further carries out feature matching within the query set to distinguish similar skeletons of different action categories. Experimental results show that SMAM-Net outperforms other baselines on the large-scale NTU RGB+D 120 dataset in one-shot and five-shot action recognition. We also report results on smaller datasets, including NTU RGB+D 60, SYSU and PKU-MMD, to demonstrate that our method is reliable and generalises well across datasets. Code and the pretrained SMAM-Net will be made publicly available.
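For readers unfamiliar with the one-shot and five-shot setting mentioned in the abstract, the following is a minimal sketch of a generic few-shot episode. It is not the SMAM module and not the authors' code: it assumes each skeleton sequence has already been encoded into a fixed-length feature vector, averages the labeled (support) vectors of each class into a prototype, and assigns a query to the class with the highest cosine similarity. All function names and the toy feature values are hypothetical.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def class_prototypes(support: Dict[str, List[Vector]]) -> Dict[str, List[float]]:
    """Average the K labeled vectors of each class into a single prototype."""
    prototypes = {}
    for label, vectors in support.items():
        k = len(vectors)
        # zip(*vectors) iterates over feature dimensions across the K shots.
        prototypes[label] = [sum(column) / k for column in zip(*vectors)]
    return prototypes


def classify(query: Vector, support: Dict[str, List[Vector]]) -> str:
    """Return the label whose prototype is most similar to the query vector."""
    prototypes = class_prototypes(support)
    return max(prototypes, key=lambda label: cosine(query, prototypes[label]))


# Toy 3-way 1-shot episode with made-up 3-dimensional features.
support = {
    "wave": [[1.0, 0.0, 0.0]],
    "kick": [[0.0, 1.0, 0.0]],
    "sit": [[0.0, 0.0, 1.0]],
}
prediction = classify([0.9, 0.2, 0.1], support)
```

A five-shot episode uses the same functions with five vectors per class in `support`. According to the abstract, SMAM-Net replaces this fixed similarity with adaptive matching between labeled and query samples and adds matching within the query set, neither of which this sketch attempts.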
Pages: 392-402
Number of pages: 11