Spatio-Temporal Features based Human Action Recognition using Convolutional Long Short-Term Deep Neural Network

Citations: 0
Authors
Saif, A. F. M. Saifuddin [1]
Wollega, Ebisa D. [1]
Kalevela, Sylvester A. [1]
Affiliations
[1] Colorado State Univ, Sch Engn, Pueblo, CO 81001 USA
Keywords
Convolutional neural network; recurrent neural network; long short-term memory; human action recognition; CNN
DOI
10.14569/IJACSA.2023.0140501
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Recognizing human intention is crucial yet challenging because the motion patterns in a sequence of evolving actions are subtle. Understanding human actions is the foundation of many applications, such as human-robot interaction, smart video monitoring, and autonomous driving. Existing deep learning methods use either spatial or temporal features during training. This research focuses on developing a lightweight method that uses both spatial and temporal features to predict human intention correctly. It proposes the Convolutional Long Short-Term Deep Network (CLSTDN), which consists of a Convolutional Neural Network (CNN) and a Recurrent Neural Network (RNN). The CNN uses Inception-ResNet-v2 to classify object-specific class categories by extracting spatial features, and the RNN uses Long Short-Term Memory (LSTM) for the final prediction based on temporal features. The proposed method was validated on four challenging benchmark datasets: UCF Sports, UCF-11, KTH, and UCF-50. Its performance was evaluated using seven performance metrics: accuracy, precision, recall, F-measure, error rate, loss, and confusion matrix. The proposed method showed better results than those reported in existing research. It is expected to encourage researchers to adopt it in future real-time applications to predict human intentions more robustly.
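The abstract lists accuracy, precision, recall, F-measure, and error rate among the evaluation metrics, all of which can be derived from a confusion matrix. The following is a minimal sketch of that derivation, not the authors' evaluation code. The function name `classification_metrics`, the macro-averaging over classes, and the toy two-class matrix are assumptions introduced here for illustration; the record does not state which averaging scheme the paper uses.

```python
def classification_metrics(confusion):
    """Derive summary metrics from a square confusion matrix.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j.
    Assumption: per-class scores are macro-averaged (unweighted mean).
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n))
    accuracy = correct / total if total else 0.0

    precisions, recalls, f_measures = [], [], []
    for k in range(n):
        true_positive = confusion[k][k]
        predicted_as_k = sum(confusion[i][k] for i in range(n))  # column sum
        actually_k = sum(confusion[k])                           # row sum
        p = true_positive / predicted_as_k if predicted_as_k else 0.0
        r = true_positive / actually_k if actually_k else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f_measures.append(f)

    return {
        "accuracy": accuracy,
        "error_rate": 1.0 - accuracy,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f_measure": sum(f_measures) / n,
    }


# Hypothetical two-class example (not data from the paper).
toy_confusion = [
    [8, 2],  # true class 0: 8 predicted correctly, 2 predicted as class 1
    [1, 9],  # true class 1: 1 predicted as class 0, 9 predicted correctly
]
metrics = classification_metrics(toy_confusion)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Macro-averaging weights every action class equally regardless of how many clips it contains; a sample-weighted average would be an equally reasonable choice on datasets with unbalanced classes.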
Pages: 1-15
Number of pages: 15