DEEP SELECTIVE FEATURE LEARNING FOR ACTION RECOGNITION

Cited by: 4
Authors
Li, Ziqiang [1]
Ge, Yongxin [1]
Feng, Jinyuan [1]
Qi, Xiaolei [1]
Yu, Jiaruo [1]
Yu, Hui [2]
Affiliations
[1] Chongqing Univ, Sch Big Data & Software Engn, Chongqing 400030, Peoples R China
[2] Univ Portsmouth, Sch Creat Technol, Portsmouth, Hants, England
Source
2020 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME) | 2020
Funding
National Natural Science Foundation of China
Keywords
action recognition; feature selection; reinforcement learning
DOI
10.1109/icme46284.2020.9102727
Chinese Library Classification number
TP31 [Computer software]
Discipline classification codes
081202; 0835
Abstract
The soft-attention mechanism has attracted considerable interest in recent years because it can capture the most discriminative image features for understanding actions. However, soft attention tends to focus on fine-grained parts of an image and ignore global information, which can lead to completely wrong classification results. To address this issue, we propose a novel deep selective feature learning network (DSFNet), which automatically learns feature maps that carry both fine-grained and global information. Specifically, DSFNet learns to adjust its actions for feature map selection by maximizing the cumulative discounted reward. Moreover, DSFNet is an easy-to-use extension of state-of-the-art base architectures for multiple tasks. Extensive experiments show that the proposed method achieves superior performance on two standard action recognition benchmarks, one for still images (PPMI) and one for videos (HMDB51).
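The abstract names the training objective (the cumulative discounted reward) without defining it. As a minimal sketch of that quantity only, the function below computes the standard discounted return G = r_0 + gamma * r_1 + gamma^2 * r_2 + ... for a finite sequence of rewards. The per-step rewards, the discount factor `gamma`, and the function name are illustrative assumptions; the record does not state how DSFNet defines its rewards or which discount it uses.

```python
def discounted_return(rewards, gamma=0.9):
    """Return sum over t of gamma**t * rewards[t].

    `rewards` and `gamma` are hypothetical inputs chosen for illustration;
    they are not taken from the DSFNet paper.
    """
    total = 0.0
    # Accumulate from the last step backward: G_t = r_t + gamma * G_{t+1}.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


if __name__ == "__main__":
    # Three selection steps, each with an assumed reward of 1.0.
    print(discounted_return([1.0, 1.0, 1.0], gamma=0.5))
```

With `gamma` below 1, rewards from later selection steps contribute less to the total, so a policy maximizing this quantity favors selections that pay off early.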
Pages: 6