Towards Improved Human Action Recognition Using Convolutional Neural Networks and Multimodal Fusion of Depth and Inertial Sensor Data

Cited by: 20
Authors
Ahmad, Zeeshan [1]
Khan, Naimul Mefraz [1]
Affiliation
[1] Ryerson Univ, Dept Elect & Comp Engn, Toronto, ON, Canada
Source
2018 IEEE INTERNATIONAL SYMPOSIUM ON MULTIMEDIA (ISM 2018) | 2018
Keywords
Convolutional neural network; data augmentation; multimodal fusion
DOI
10.1109/ISM.2018.000-2
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper aims to improve the accuracy of Human Action Recognition (HAR) by fusing depth and inertial sensor data. First, we transform the depth data into Sequential Front view Images (SFI) and fine-tune the pre-trained AlexNet on these images. Then, the inertial data is converted into Signal Images (SI), and another convolutional neural network (CNN) is trained on these images. Finally, learned features are extracted from both CNNs and fused into a shared feature layer, which is fed to the classifier. We experiment with two classifiers, Support Vector Machines (SVM) and a softmax classifier, and compare their performance. The recognition accuracy of each modality alone (depth data and inertial sensor data) is also calculated and compared with the fusion-based accuracy to show that fusing modalities yields better results than either modality alone. Experimental results on the UTD-MHAD and Kinect 2D datasets show that the proposed method achieves state-of-the-art results compared with other recently proposed visual-inertial action recognition methods.
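The fusion step described in the abstract can be illustrated with a minimal sketch. It assumes each CNN has already produced a fixed-length feature vector and that fusion is plain concatenation; the per-modality L2 normalization, the function names, and the linear softmax layer are illustrative assumptions, not details taken from the paper.

```python
import math
from typing import List, Sequence


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length (assumption: keeps modalities on a comparable scale)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)


def fuse_features(depth_feat: Sequence[float], inertial_feat: Sequence[float]) -> List[float]:
    """Feature-level fusion by concatenating the two per-modality vectors."""
    return l2_normalize(depth_feat) + l2_normalize(inertial_feat)


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over class scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(fused: Sequence[float],
             weights: Sequence[Sequence[float]],
             biases: Sequence[float]) -> List[float]:
    """Hypothetical linear layer (one weight row per class) followed by softmax."""
    scores = [sum(w * x for w, x in zip(row, fused)) + b
              for row, b in zip(weights, biases)]
    return softmax(scores)
```

In this sketch the fused vector has length equal to the sum of the two input lengths, and `classify` returns one probability per class. An SVM could be trained on the same fused vectors in place of the softmax layer.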
Pages: 223-230
Page count: 8