Dynamic Spatio-Temporal Specialization Learning for Fine-Grained Action Recognition

Cited by: 14
Authors
Li, Tianjiao [1]
Foo, Lin Geng [1]
Ke, Qiuhong [2]
Rahmani, Hossein [3]
Wang, Anran [4]
Wang, Jinghua [5]
Liu, Jun [1]
Affiliations
[1] Singapore Univ Technol & Design, ISTD Pillar, Singapore, Singapore
[2] Monash Univ, Dept Data Sci & AI, Melbourne, Vic, Australia
[3] Univ Lancaster, Sch Comp & Commun, Lancaster, England
[4] ByteDance, Beijing, Peoples R China
[5] Harbin Inst Technol, Sch Comp Sci & Technol, Harbin, Peoples R China
Source
Funding
National Research Foundation, Singapore;
Keywords
Action recognition; Fine-grained; Dynamic neural networks; HUMAN NEURAL SYSTEM; FACE; REPRESENTATIONS; IDENTITY;
DOI
10.1007/978-3-031-19772-7_23
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The goal of fine-grained action recognition is to successfully discriminate between action categories with subtle differences. To tackle this, we derive inspiration from the human visual system which contains specialized regions in the brain that are dedicated towards handling specific tasks. We design a novel Dynamic Spatio-Temporal Specialization (DSTS) module, which consists of specialized neurons that are only activated for a subset of samples that are highly similar. During training, the loss forces the specialized neurons to learn discriminative fine-grained differences to distinguish between these similar samples, improving fine-grained recognition. Moreover, a spatio-temporal specialization method further optimizes the architectures of the specialized neurons to capture either more spatial or temporal fine-grained information, to better tackle the large range of spatio-temporal variations in the videos. Lastly, we design an Upstream-Downstream Learning algorithm to optimize our model's dynamic decisions during training, improving the performance of our DSTS module. We obtain state-of-the-art performance on two widely-used fine-grained action recognition datasets.
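The idea of neurons that are "only activated for a subset of samples that are highly similar" can be illustrated with a toy routing sketch. The abstract does not say how the DSTS module makes its dynamic decisions, so everything below is an assumption made for illustration only: the cosine-similarity gate, the threshold value, and the names `cosine`, `active_neurons`, and `neuron_keys` are hypothetical and are not taken from the paper.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def active_neurons(feature, neuron_keys, threshold=0.9):
    """Return the indices of the (hypothetical) specialized neurons whose
    key vector is similar enough to the sample feature to be activated.
    The similarity gate is an assumed stand-in, not the paper's mechanism."""
    return [i for i, key in enumerate(neuron_keys)
            if cosine(feature, key) >= threshold]

# Assumed key vectors for two specialized neurons.
neuron_keys = [[1.0, 0.0], [0.0, 1.0]]

# Two similar samples and one dissimilar sample (made-up features).
samples = {"a": [0.9, 0.1], "b": [1.0, 0.2], "c": [0.1, 0.9]}

# Similar samples are routed to the same neuron; the dissimilar one is not.
routes = {name: active_neurons(f, neuron_keys) for name, f in samples.items()}
print(routes)
```

In this sketch, samples "a" and "b" share a neuron, so a loss computed on that neuron's output would have to separate them using their small differences, which is the intuition the abstract describes.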
Pages: 386 - 403
Page count: 18
Related papers
50 in total
  • [1] Learning to Represent Spatio-Temporal Features for Fine Grained Action Recognition
    Sakhalkar, Kaustubh
    Bremond, Francois
    2018 IEEE THIRD INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, APPLICATIONS AND SYSTEMS (IPAS), 2018, : 268 - 272
  • [2] ViSiL: Fine-grained Spatio-Temporal Video Similarity Learning
    Kordopatis-Zilos, Giorgos
    Papadopoulos, Symeon
    Patras, Ioannis
    Kompatsiaris, Ioannis
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 6360 - 6369
  • [3] Fine-Grained Spatio-Temporal Parsing Network for Action Quality Assessment
    Gedamu, Kumie
    Ji, Yanli
    Yang, Yang
    Shao, Jie
    Shen, Heng Tao
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2023, 32 : 6386 - 6400
  • [4] ENFIRE: A Spatio-Temporal Fine-Grained Reconfigurable Hardware
    Qian, Wenchao
    Babecki, Christopher
    Karam, Robert
    Paul, Somnath
    Bhunia, Swarup
    IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, 2017, 25 (01) : 177 - 188
  • [5] FineMoGen: Fine-Grained Spatio-Temporal Motion Generation and Editing
    Zhang, Mingyuan
    Li, Huirong
    Cai, Zhongang
    Ren, Jiawei
    Yang, Lei
    Liu, Ziwei
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [6] Spatio-Temporal Detection of Fine-Grained Dyadic Human Interactions
    van Gemeren, Coert
    Poppe, Ronald
    Veltkamp, Remco C.
    HUMAN BEHAVIOR UNDERSTANDING, 2016, 9997 : 116 - 133
  • [7] Fine-grained action recognition using dynamic kernels
    Yenduri, Sravani
    Perveen, Nazil
    Chalavadi, Vishnu
    Mohan, Krishna C.
    PATTERN RECOGNITION, 2022, 122
  • [8] Predicting Fine-Grained Traffic Conditions via Spatio-Temporal LSTM
    Wei, Xiaojuan
    Li, Jinglin
    Yuan, Quan
    Chen, Kaihui
    Zhou, Ao
    Yang, Fangchun
    WIRELESS COMMUNICATIONS & MOBILE COMPUTING, 2019, 2019
  • [9] LEARNING SPATIO-TEMPORAL DEPENDENCIES FOR ACTION RECOGNITION
    Cai, Qiao
    Yin, Yafeng
    Man, Hong
    2013 20TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP 2013), 2013, : 3740 - 3744
  • [10] Learning Convolutional Action Primitives for Fine-grained Action Recognition
    Lea, Colin
    Vidal, Rene
    Hager, Gregory D.
    2016 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2016, : 1642 - 1649