Motion-modulated Temporal Fragment Alignment Network For Few-Shot Action Recognition

Cited by: 33

Authors:

Wu, Jiamin [1]

Zhang, Tianzhu [1]

Zhang, Zhe [2]

Wu, Feng [1]

Zhang, Yongdong [1]

Affiliations:

[1] Univ Sci & Technol China, Hefei, Peoples R China

[2] Lunar Explorat & Space Engn Ctr CNSA, Beijing, Peoples R China

Source:

2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2022

DOI:

10.1109/CVPR52688.2022.00894

Chinese Library Classification number:

TP18 [Artificial intelligence theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

While the majority of few-shot learning (FSL) models focus on image classification, extending them to action recognition is challenging because of the additional temporal dimension in videos. To address this issue, we propose an end-to-end Motion-modulated Temporal Fragment Alignment Network (MTFAN) that jointly explores task-specific motion modulation and multi-level temporal fragment alignment for Few-Shot Action Recognition (FSAR). The proposed MTFAN model has several merits. First, we design a motion modulator conditioned on the learned task-specific motion embeddings, which can activate the channels related to the task-shared motion patterns for each frame. Second, a segment attention mechanism is proposed to automatically discover the higher-level segments for multi-level temporal fragment alignment, which encompasses frame-to-frame, segment-to-segment, and segment-to-frame alignments. To the best of our knowledge, this is the first work to exploit task-specific motion modulation for FSAR. Extensive experimental results on four standard benchmarks demonstrate that the proposed model performs favorably against state-of-the-art FSAR methods.
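
The two ideas named in the abstract, channel modulation driven by task-level motion and frame-to-frame alignment, can be illustrated with a minimal pure-Python sketch. This is not the MTFAN implementation: the abstract does not specify how motion embeddings are learned or how alignment scores are computed, so everything here is an assumption for illustration. The function names `motion_gate`, `modulate`, and `frame_to_frame_score` are hypothetical, the gate is a fixed sigmoid of adjacent-frame differences rather than a learned module, and alignment is a simple best-match cosine average.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def motion_gate(videos):
    """Hypothetical task-level channel gate (an assumption, not the paper's module).

    Averages the absolute difference between adjacent frames over every
    video in the task, then squashes each channel with a sigmoid.
    """
    dim = len(videos[0][0])
    total = [0.0] * dim
    count = 0
    for frames in videos:
        for prev, cur in zip(frames, frames[1:]):
            for c in range(dim):
                total[c] += abs(cur[c] - prev[c])
            count += 1
    mean = [t / count for t in total] if count else total
    return [1.0 / (1.0 + math.exp(-m)) for m in mean]


def modulate(frames, gate):
    """Scale every frame's channels by the shared task-level gate."""
    return [[x * g for x, g in zip(frame, gate)] for frame in frames]


def frame_to_frame_score(query, support):
    """Average, over query frames, of the best cosine match among support frames."""
    return sum(max(cosine(q, s) for s in support) for q in query) / len(query)


# Toy task: each video is a list of per-frame feature vectors (2 channels).
support = [[1.0, 0.0], [0.0, 1.0]]
query = [[0.0, 2.0], [3.0, 0.0]]

gate = motion_gate([support, query])
score = frame_to_frame_score(modulate(query, gate), modulate(support, gate))
print(gate, score)
```

Because the best-match step ignores frame order, the toy query above aligns with the support video even though its frames appear in the opposite order. A segment-level variant would pool groups of frames before matching; the abstract states that segments are discovered by an attention mechanism, which this sketch does not attempt.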

Pages: 9141-9150

Page count: 10

50 items in total

[1] Elastic temporal alignment for few-shot action recognition
Pan, Fei
Xu, Chunlei
Zhang, Hongjie
Guo, Jie
Guo, Yanwen
IET COMPUTER VISION, 2023, 17 (01) : 39 - 50
[2] Searching for Better Spatio-temporal Alignment in Few-Shot Action Recognition
Cao, Yichao
Su, Xiu
Tang, Qingfei
You, Shan
Lu, Xiaobo
Xu, Chang
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022
[3] Few-shot action recognition with implicit temporal alignment and pair similarity optimization
Cao, Congqi
Li, Yajuan
Lv, Qinyi
Wang, Peng
Zhang, Yanning
COMPUTER VISION AND IMAGE UNDERSTANDING, 2021, 210
[4] TAEN: Temporal Aware Embedding Network for Few-Shot Action Recognition
Ben-Ari, Rami
Nacson, Mor Shpigel
Azulai, Ophir
Barzelay, Udi
Rotman, Daniel
2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2021, 2021, : 2780 - 2788
[5] FTAN: Frame-to-frame temporal alignment network with contrastive learning for few-shot action recognition
Yu, Bin
Hou, Yonghong
Guo, Zihui
Gao, Zhiyi
Li, Yueyang
IMAGE AND VISION COMPUTING, 2024, 149
[6] Enhancing Few-Shot Action Recognition Using Skeleton Temporal Alignment and Adversarial Training
Xu, Qingyang
Yang, Jianjun
Zhang, Hongyi
Jie, Xin
Bandara, Danushka
IEEE ACCESS, 2024, 12 : 31745 - 31755
[7] Temporal-Relational CrossTransformers for Few-Shot Action Recognition
Perrett, Toby
Masullo, Alessandro
Burghardt, Tilo
Mirmehdi, Majid
Damen, Dima
2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 475 - 484
[8] Revisiting the Spatial and Temporal Modeling for Few-Shot Action Recognition
Xing, Jiazheng
Wang, Mengmeng
Liu, Yong
Mu, Boyu
THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 3, 2023, : 3001 - 3009
[9] Multi-level alignment for few-shot temporal action localization
Keisham, Kanchan
Jalali, Amin
Kim, Jonghong
Lee, Minho
INFORMATION SCIENCES, 2023, 650
[10] Hierarchical Motion Excitation Network for Few-Shot Video Recognition
Wang, Bing
Wang, Xiaohua
Ren, Shiwei
Wang, Weijiang
Shi, Yueting
ELECTRONICS, 2023, 12 (05)
