Task Adaptive Modeling for Few-shot Action Recognition

Cited by: 0
Authors
Wang, Jiayi [1 ]
Jin, Yi [1 ]
Feng, Songhe [1 ]
Li, Yidong [1 ]
Affiliations
[1] Beijing JiaoTong Univ, Sch Comp & Informat Technol, Beijing, Peoples R China
Source
2022 IEEE 24TH INTERNATIONAL WORKSHOP ON MULTIMEDIA SIGNAL PROCESSING (MMSP) | 2022
Funding
National Natural Science Foundation of China;
Keywords
action recognition; few-shot learning; task adaptive; video classification;
DOI
10.1109/MMSP55362.2022.9949513
CLC Classification Number
TP31 [Computer Software];
Subject Classification Code
081202 ; 0835 ;
Abstract
Collecting action recognition datasets is time-consuming and labor-intensive. To alleviate this, few-shot action recognition has emerged, in which models are learned through episodic training. However, because few-shot tasks are sampled randomly, individual tasks differ greatly from one another and the characteristics of their classes are diverse. Most current methods simply apply the same processing pipeline to every task, ignoring the correlations within each task. To address this, we propose a task adaptive network for few-shot action recognition that exploits the dependency between support-set and query-set categories. Our method has two key components. First, we add an attention module after the feature extractor, which uses an attention mechanism to focus the obtained feature representation on the more important local information. Second, we design a task adaptive module that uses the support-set samples to strengthen all samples of the current task: it reinforces the features shared within each support-set class and augments the query set to highlight the differences between classes. We conduct extensive experiments on two widely used action recognition datasets, HMDB51 and UCF101. The results show that our method is highly competitive and performs well on few-shot action recognition.
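The abstract describes the two modules only at a high level; the paper's exact formulation is not given here. The following is a minimal sketch in PyTorch of one way such a pipeline could look, assuming pre-pooled clip embeddings from a backbone, a squeeze-and-excitation style attention block (in the spirit of reference [10]), and a prototype-based task adaptive step. All class names, tensor shapes, and the soft-assignment step are hypothetical illustrations, not the authors' implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class ChannelAttention(nn.Module):
    """SE-style attention after the backbone: re-weights feature channels so the
    representation emphasises the more informative local cues (hypothetical)."""
    def __init__(self, dim: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(dim, dim // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(dim // reduction, dim),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (N, dim) pooled clip features
        return x * self.fc(x)

class TaskAdaptiveModule(nn.Module):
    """Strengthens the features shared within each support class and pushes query
    features toward the prototypes they resemble (a sketch, not the exact method)."""
    def forward(self, support: torch.Tensor, query: torch.Tensor):
        # support: (n_way, k_shot, dim); query: (n_query, dim)
        prototypes = support.mean(dim=1)                     # (n_way, dim)
        # Reinforce within-class commonality in the support set.
        support = support + prototypes.unsqueeze(1)
        # Soft-assign each query to the prototypes and add the weighted mixture,
        # enlarging between-class differences in the query embedding.
        sim = F.softmax(
            F.normalize(query, dim=-1) @ F.normalize(prototypes, dim=-1).t(), dim=-1
        )
        query = query + sim @ prototypes
        return support, query

if __name__ == "__main__":
    n_way, k_shot, n_query, dim = 5, 1, 10, 512
    att, tam = ChannelAttention(dim), TaskAdaptiveModule()
    support = att(torch.randn(n_way * k_shot, dim)).view(n_way, k_shot, dim)
    query = att(torch.randn(n_query, dim))
    support, query = tam(support, query)
    # Classify queries by cosine similarity to the refined prototypes.
    logits = F.normalize(query, dim=-1) @ F.normalize(support.mean(1), dim=-1).t()
    print(logits.shape)  # (n_query, n_way)

In an episodic setting, each sampled task would pass its support and query clips through the shared backbone and attention block, refine both sets with the task adaptive module, and classify queries against the refined class prototypes.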
Pages: 6
References
29 references in total
  • [1] Bishay M., 2019, BMVC
  • [2] Cao KD, 2020, PROC CVPR IEEE, P10615, DOI 10.1109/CVPR42600.2020.01063
  • [3] Chikontwe P., 2022, ARXIV
  • [4] Tran D, Bourdev L, Fergus R, Torresani L, Paluri M. Learning Spatiotemporal Features with 3D Convolutional Networks. 2015 IEEE International Conference on Computer Vision (ICCV), 2015: 4489-4497
  • [5] Dwivedi S K, Gupta V, Mitra R, Ahmed S, Jain A. ProtoGAN: Towards Few Shot Learning for Action Recognition. 2019 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW), 2019: 1308-1316
  • [6] Feichtenhofer C, Fan H, Malik J, He K. SlowFast Networks for Video Recognition. 2019 IEEE/CVF International Conference on Computer Vision (ICCV), 2019: 6201-6210
  • [7] Godard C, Mac Aodha O, Firman M, Brostow G. Digging Into Self-Supervised Monocular Depth Estimation. 2019 IEEE/CVF International Conference on Computer Vision (ICCV), 2019: 3827-3837
  • [8] He K, Zhang X, Ren S, Sun J. Deep Residual Learning for Image Recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016: 770-778
  • [9] Zhang H, 2020, Computer Vision - ECCV 2020, 16th European Conference, Proceedings, Lecture Notes in Computer Science (LNCS 12348), P102, DOI 10.1007/978-3-030-58580-8_7
  • [10] Hu J, Shen L, Albanie S, Sun G, Wu E. Squeeze-and-Excitation Networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020, 42(8): 2011-2023