ECNN: Activity Recognition Using Ensembled Convolutional Neural Networks

Cited by: 0

Authors:

Roy, Aprameyo [1]

Mishra, Deepak [1]

Affiliations:

[1] Indian Institute of Space Science and Technology, Department of Avionics, Thiruvananthapuram 695547, Kerala, India

Source:

PROCEEDINGS OF THE 2019 IEEE REGION 10 CONFERENCE (TENCON 2019): TECHNOLOGY, KNOWLEDGE, AND SOCIETY | 2019

Keywords:

Human Activity Recognition; 3D-CNN; 2D-CNN; spatio temporal; ensembling;

DOI:

10.1109/tencon.2019.8929519

Chinese Library Classification (CLC):

TM [Electrical Engineering]; TN [Electronics and Communication Technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

Human Activity Recognition (HAR) has long been a compelling problem in computer vision. Our focus is trimmed activity recognition: identifying the class of human activity in a video that has been temporally trimmed to contain only the periods in which human activity is present. In the past few years there has been a transition from handcrafted features for classification to deep convolutional neural networks, which work on raw video data to extract features and classify human activities. 3D convolutional neural networks (3D-CNNs) learn features from both the temporal and spatial dimensions and prove to be very powerful in finding correlations in signals containing spatiotemporal information. 3D-CNNs have been extremely successful in activity recognition. We explore the shortcomings of a 3D-CNN architecture and propose ensembling with a 2D-CNN to overcome them, for significantly better performance in activity recognition.
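
The abstract does not say how the 2D-CNN and 3D-CNN outputs are combined. As an illustration only, the sketch below shows one common way to ensemble two classifiers: a weighted average of their per-class softmax probabilities (late fusion). The function names, the `weight_3d` parameter, and the example scores are all hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_predict(logits_2d, logits_3d, weight_3d=0.5):
    """Late fusion: weighted average of the two models' class probabilities.

    weight_3d is a hypothetical mixing weight, not a value from the paper.
    Returns the index of the winning class and the fused probabilities.
    """
    if len(logits_2d) != len(logits_3d):
        raise ValueError("both models must score the same set of classes")
    p2 = softmax(logits_2d)
    p3 = softmax(logits_3d)
    fused = [(1.0 - weight_3d) * a + weight_3d * b for a, b in zip(p2, p3)]
    best = max(range(len(fused)), key=fused.__getitem__)
    return best, fused


# Made-up scores for three activity classes (illustration only).
logits_2d = [2.0, 0.5, 0.1]  # stand-in for a frame-based 2D-CNN output
logits_3d = [0.2, 0.4, 3.0]  # stand-in for a clip-based 3D-CNN output
label, probs = ensemble_predict(logits_2d, logits_3d)
```

Averaging probabilities rather than raw scores keeps the two models on a comparable scale; the mixing weight would normally be chosen on held-out validation data.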


Pages: 757-760

Page count: 4

