Motion-modulated Temporal Fragment Alignment Network For Few-Shot Action Recognition

Cited by: 33
Authors
Wu, Jiamin [1]
Zhang, Tianzhu [1]
Zhang, Zhe [2]
Wu, Feng [1]
Zhang, Yongdong [1]
Affiliations
[1] Univ Sci & Technol China, Hefei, Peoples R China
[2] Lunar Explorat & Space Engn Ctr CNSA, Beijing, Peoples R China
Source
2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2022
DOI
10.1109/CVPR52688.2022.00894
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
While the majority of few-shot learning (FSL) models focus on image classification, extending them to action recognition is challenging because of the additional temporal dimension in videos. To address this issue, we propose an end-to-end Motion-modulated Temporal Fragment Alignment Network (MTFAN) that jointly explores task-specific motion modulation and multi-level temporal fragment alignment for Few-Shot Action Recognition (FSAR). The proposed MTFAN model has two main merits. First, we design a motion modulator conditioned on the learned task-specific motion embeddings, which can activate the channels related to the task-shared motion patterns for each frame. Second, we propose a segment attention mechanism that automatically discovers higher-level segments for multi-level temporal fragment alignment, which encompasses frame-to-frame, segment-to-segment, and segment-to-frame alignments. To the best of our knowledge, this is the first work to exploit task-specific motion modulation for FSAR. Extensive experimental results on four standard benchmarks demonstrate that the proposed model performs favorably against state-of-the-art FSAR methods.
Pages: 9141 - 9150
Page count: 10
Related papers
  • [1] Elastic temporal alignment for few-shot action recognition
    Pan, Fei
    Xu, Chunlei
    Zhang, Hongjie
    Guo, Jie
    Guo, Yanwen
    IET COMPUTER VISION, 2023, 17 (01) : 39 - 50
  • [2] Searching for Better Spatio-temporal Alignment in Few-Shot Action Recognition
    Cao, Yichao
    Su, Xiu
    Tang, Qingfei
    You, Shan
    Lu, Xiaobo
    Xu, Chang
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022
  • [3] Few-shot action recognition with implicit temporal alignment and pair similarity optimization
    Cao, Congqi
    Li, Yajuan
    Lv, Qinyi
    Wang, Peng
    Zhang, Yanning
    COMPUTER VISION AND IMAGE UNDERSTANDING, 2021, 210
  • [4] TAEN: Temporal Aware Embedding Network for Few-Shot Action Recognition
    Ben-Ari, Rami
    Nacson, Mor Shpigel
    Azulai, Ophir
    Barzelay, Udi
    Rotman, Daniel
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2021, 2021 : 2780 - 2788
  • [5] FTAN: Frame-to-frame temporal alignment network with contrastive learning for few-shot action recognition
    Yu, Bin
    Hou, Yonghong
    Guo, Zihui
    Gao, Zhiyi
    Li, Yueyang
    IMAGE AND VISION COMPUTING, 2024, 149
  • [6] Enhancing Few-Shot Action Recognition Using Skeleton Temporal Alignment and Adversarial Training
    Xu, Qingyang
    Yang, Jianjun
    Zhang, Hongyi
    Jie, Xin
    Bandara, Danushka
    IEEE ACCESS, 2024, 12 : 31745 - 31755
  • [7] Temporal-Relational CrossTransformers for Few-Shot Action Recognition
    Perrett, Toby
    Masullo, Alessandro
    Burghardt, Tilo
    Mirmehdi, Majid
    Damen, Dima
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021 : 475 - 484
  • [8] Revisiting the Spatial and Temporal Modeling for Few-Shot Action Recognition
    Xing, Jiazheng
    Wang, Mengmeng
    Liu, Yong
    Mu, Boyu
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 3, 2023 : 3001 - 3009
  • [9] Multi-level alignment for few-shot temporal action localization
    Keisham, Kanchan
    Jalali, Amin
    Kim, Jonghong
    Lee, Minho
    INFORMATION SCIENCES, 2023, 650
  • [10] Hierarchical Motion Excitation Network for Few-Shot Video Recognition
    Wang, Bing
    Wang, Xiaohua
    Ren, Shiwei
    Wang, Weijiang
    Shi, Yueting
    ELECTRONICS, 2023, 12 (05)