Enhancing Human Action Recognition through Temporal Saliency

Authors
Adeli, Vida [1]
Fazl-Ersi, Ehsan [1]
Harati, Ahad [1]
Affiliations
[1] Ferdowsi Univ Mashhad, Dept Comp Engn, Mashhad, Razavi Khorasan, Iran
Source
PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE (ICPRAI 2018) | 2018
Keywords
Action recognition; Motion; Region proposal; Convolutional Neural Networks; Actionness;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Images and videos have become ubiquitous in every aspect of life owing to the proliferation of digital recording devices. This has encouraged the development of algorithms that can analyze video content and perform human action recognition. This paper investigates the challenging problem of action recognition by outlining a new approach to representing a video sequence. A novel framework is developed to produce informative features for action labeling in a weakly supervised learning (WSL) setting, both during training and testing. Using appearance and motion information, the goal is to identify frame regions that are likely to contain actions. A three-stream convolutional neural network is adopted and improved by a proposed method based on extracting actionness regions. This reduces computation, since only some parts of each RGB frame are processed, and it also means that fewer non-activity-related regions, which can mislead the recognition system, are interpreted. We use the UCF Sports dataset, a dataset of realistic sports videos, as our evaluation benchmark. We show that the proposed approach can outperform existing state-of-the-art methods.
Pages: 176 - 181
Page count: 6
Related papers
50 in total
  • [31] End-to-end temporal attention extraction and human action recognition
    Zhang, Hong
    Xin, Miao
    Wang, Shuhang
    Yang, Yifan
    Zhang, Lei
    Wang, Helong
    MACHINE VISION AND APPLICATIONS, 2018, 29 (07) : 1127 - 1142
  • [32] Bag of Spatio-temporal Synonym Sets for Human Action Recognition
    Pang, Lin
    Cao, Juan
    Guo, Junbo
    Lin, Shouxun
    Song, Yan
    ADVANCES IN MULTIMEDIA MODELING, PROCEEDINGS, 2010, 5916 : 422 - 432
  • [33] Spatio-Temporal Information Fusion and Filtration for Human Action Recognition
    Zhang, Man
    Li, Xing
    Wu, Qianhan
    SYMMETRY-BASEL, 2023, 15 (12):
  • [35] Spatio-Temporal VLAD Encoding for Human Action Recognition in Videos
    Duta, Ionut C.
    Ionescu, Bogdan
    Aizawa, Kiyoharu
    Sebe, Nicu
    MULTIMEDIA MODELING (MMM 2017), PT I, 2017, 10132 : 365 - 378
  • [36] Local Feature Fusion Temporal Convolutional Network for Human Action Recognition
    Song Z.
    Zhou Y.
    Jia J.
    Xin S.
    Liu Y.
    Jisuanji Fuzhu Sheji Yu Tuxingxue Xuebao/Journal of Computer-Aided Design and Computer Graphics, 2020, 32 (03) : 418 - 424
  • [37] SPARSE CODING-BASED SPATIOTEMPORAL SALIENCY FOR ACTION RECOGNITION
    Zhang, Tao
    Xu, Long
    Yang, Jie
    Shi, Pengfei
    Jia, Wenjing
    2015 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2015, : 2045 - 2049
  • [38] Saliency guided local and global descriptors for effective action recognition
    Ashwan Abdulmunem
    Yu-Kun Lai
    Xianfang Sun
    Computational Visual Media, 2016, 2 (01) : 97 - 106
  • [39] Motion saliency based hierarchical attention network for action recognition
    Guo, Zihui
    Hou, Yonghong
    Xiao, Renyi
    Li, Chuankun
    Li, Wanqing
    MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 82 (03) : 4533 - 4550
  • [40] SPATIO-TEMPORAL PYRAMIDAL ACCORDION REPRESENTATION FOR HUMAN ACTION RECOGNITION
    Sekma, Manel
    Mejdoub, Mahmoud
    Ben Amar, Chokri
    2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,