ReadingAct RGB-D action dataset and human action recognition from local features

Cited by: 14
Authors
Chen, Lulu [1 ]
Wei, Hong [1 ]
Ferryman, James [1 ]
Affiliations
[1] Univ Reading, Sch Syst Engn, Computat Vis Grp, Reading RG6 6AY, Berks, England
Keywords
Human action recognition; Depth sensor; Spatio-temporal local features; Dynamic time warping; ReadingAct action dataset; DENSE;
DOI
10.1016/j.patrec.2013.09.004
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
For general home monitoring, a system should automatically interpret people's actions. The system should be non-intrusive and able to cope with cluttered backgrounds and loose clothing. An approach based on spatio-temporal local features and a Bag-of-Words (BoW) model is proposed for single-person action recognition from combined intensity and depth images. To restore the temporal structure lost in the traditional BoW method, a dynamic time alignment technique with temporal binning is applied in this work, which had not previously been implemented in the literature for human action recognition on depth imagery. A novel human action dataset with depth data has been created using two Microsoft Kinect sensors. The ReadingAct dataset contains 20 subjects and 19 actions, for a total of 2340 videos. To investigate the effect of using depth images and the proposed method, testing was conducted on three depth datasets, and the proposed method was compared to traditional Bag-of-Words methods. Results showed that the proposed method improves recognition accuracy when adding depth to the conventional intensity data, and has advantages when dealing with long actions. (C) 2013 Elsevier B.V. All rights reserved.
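The dynamic time alignment mentioned in the abstract builds on classic dynamic time warping (DTW), which scores the best monotonic alignment between two sequences of unequal length. The following is a minimal, hypothetical sketch of DTW over sequences of per-temporal-bin feature vectors (e.g. BoW histograms); it illustrates the general technique only, not the authors' exact implementation.

```python
import numpy as np

def dtw_distance(seq_a, seq_b):
    """Classic DTW between two sequences of feature vectors
    (e.g. per-temporal-bin BoW histograms). Illustrative sketch;
    the paper's actual alignment may differ in cost and constraints."""
    n, m = len(seq_a), len(seq_b)
    # cost[i, j] = minimal cumulative cost of aligning seq_a[:i] with seq_b[:j]
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(seq_a[i - 1] - seq_b[j - 1])  # local distance
            cost[i, j] = d + min(cost[i - 1, j],      # insertion
                                 cost[i, j - 1],      # deletion
                                 cost[i - 1, j - 1])  # match
    return cost[n, m]

# Toy usage: two histogram sequences of different lengths
a = np.array([[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]])
b = np.array([[1.0, 0.0], [0.0, 1.0]])
print(dtw_distance(a, b))  # ≈ 0.7071
```

Because the warping path may stretch or compress either sequence, two executions of the same action performed at different speeds can still be matched, which is what gives the alignment an advantage on long actions over a single global BoW histogram.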
Pages: 159-169 (11 pages)
Related papers
50 items total
  • [41] Using Local-Based Harris-PHOG Features in a Combination Framework for Human Action Recognition
    Hemati, R.
    Mirzakuchaki, S.
    ARABIAN JOURNAL FOR SCIENCE AND ENGINEERING, 2014, 39 (02) : 903 - 912
  • [42] The Johns Hopkins University Multimodal Dataset for Human Action Recognition
    Murray, Thomas S.
    Mendat, Daniel R.
    Pouliquen, Philippe O.
    Andreou, Andreas G.
    RADAR SENSOR TECHNOLOGY XIX; AND ACTIVE AND PASSIVE SIGNATURES VI, 2015, 9461
  • [43] A four-stream ConvNet based on spatial and depth flow for human action classification using RGB-D data
    D. Srihari
    P. V. V. Kishore
    E. Kiran Kumar
    D. Anil Kumar
    M. Teja Kiran Kumar
    M. V. D. Prasad
    Ch. Raghava Prasad
    Multimedia Tools and Applications, 2020, 79 : 11723 - 11746
  • [44] Local Surface Geometric Feature for 3D human action recognition
    Zhang, Erhu
    Chen, Wanjun
    Zhang, Zhuomin
    Zhang, Yan
    NEUROCOMPUTING, 2016, 208 : 281 - 289
  • [46] Human action recognition with transformer based on convolutional features
    Shi, Chengcheng
    Liu, Shuxin
    INTELLIGENT DECISION TECHNOLOGIES-NETHERLANDS, 2024, 18 (02): : 881 - 896
  • [47] Handcrafted localized phase features for human action recognition
    Hejazi, Seyed Mostafa
    Abhayaratne, Charith
    IMAGE AND VISION COMPUTING, 2022, 123
  • [48] Combining appearance and structural features for human action recognition
    Zhao, Dongjie
    Shao, Ling
    Zhen, Xiantong
    Liu, Yan
    NEUROCOMPUTING, 2013, 113 : 88 - 96
  • [49] Fusing mixed visual features for human action recognition
    Tang, Chao
    Zhou, Changle
    Pan, Wei
    Xie, Lidong
    Hu, Huosheng
    INTERNATIONAL JOURNAL OF MODELLING IDENTIFICATION AND CONTROL, 2013, 19 (01) : 13 - 22
  • [50] Human Action Invarianceness for Human Action Recognition
    Sjarif, Nilam Nur Amir
    Shamsuddin, Siti Mariyam
    2015 9TH INTERNATIONAL CONFERENCE ON SOFTWARE, KNOWLEDGE, INFORMATION MANAGEMENT AND APPLICATIONS (SKIMA), 2015,