DEEP SELECTIVE FEATURE LEARNING FOR ACTION RECOGNITION

Cited by: 4

Authors:

Li, Ziqiang [1]

Ge, Yongxin [1]

Feng, Jinyuan [1]

Qi, Xiaolei [1]

Yu, Jiaruo [1]

Yu, Hui [2]

Affiliations:

[1] Chongqing Univ, Sch Big Data & Software Engn, Chongqing 400030, Peoples R China

[2] Univ Portsmouth, Sch Creat Technol, Portsmouth, Hants, England

Source:

2020 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME) | 2020

Funding:

National Natural Science Foundation of China;

Keywords:

action recognition; feature selection; reinforcement learning;

DOI:

10.1109/icme46284.2020.9102727

Chinese Library Classification (CLC) number:

TP31 [Computer software];

Discipline classification codes:

081202 ; 0835 ;

Abstract:

The soft-attention mechanism has attracted considerable attention in recent years due to its ability to capture the most discriminative image features for understanding actions. However, soft attention tends to focus on fine-grained parts of images and ignore global information, which can lead to entirely wrong classification results. To address this issue, we propose a novel deep selective feature learning network (DSFNet), which automatically learns feature maps containing both fine-grained and global information. Specifically, DSFNet learns to adjust its actions for feature map selection by maximizing the cumulative discounted reward. Moreover, DSFNet is an easy-to-use extension of state-of-the-art base architectures for multiple tasks. Extensive experiments show that the proposed method achieves superior performance on two standard action recognition benchmarks, covering still images (PPMI) and videos (HMDB51).

Pages: 6
