Human Action Recognition Using Fusion of Modern Deep Convolutional and Recurrent Neural Networks

Cited by: 0

Authors:

Tkachenko, Dmytro [1]

Affiliations:

[1] Natl Tech Univ Ukraine, Igor Sikorsky Kyiv Polytech Inst, Inst Appl Syst Anal, Kiev, Ukraine

Source:

2018 IEEE FIRST INTERNATIONAL CONFERENCE ON SYSTEM ANALYSIS & INTELLIGENT COMPUTING (SAIC) | 2018

Keywords:

convolutional neural networks; human action recognition; recurrent neural networks; representation learning; video classification;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP301 [Theory, Methods];

Discipline classification code:

081202;

Abstract:

This paper studies the application of modern deep convolutional and recurrent neural networks to video classification, specifically human action recognition. A multi-stream architecture is proposed that uses ideas from representation learning to extract embeddings of multimodal features. It is based on 2D convolutional and recurrent neural networks, and the fusion model receives a video embedding as input. Classification is thus performed on this compact representation of spatial, temporal, and audio information. The proposed architecture achieves 93.1% accuracy on UCF101, which is better than the results obtained with models of similar architecture, and it also produces representations that other models can use as features; anomaly detection using autoencoders is proposed as an example.
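
The fusion step described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes the spatial, temporal, and audio embeddings are joined by plain concatenation and classified by a single linear layer with softmax. The embedding sizes, the random placeholder values, and the names `fuse_embeddings` and `classify` are all hypothetical.

```python
import math
import random

NUM_CLASSES = 101  # UCF101 contains 101 action classes


def fuse_embeddings(spatial, temporal, audio):
    """Concatenate per-stream embeddings into one video embedding.

    Concatenation is an assumption; the abstract does not state how
    the streams are combined.
    """
    return list(spatial) + list(temporal) + list(audio)


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(video_embedding, weights, biases):
    """Hypothetical fusion model: one linear layer followed by softmax."""
    logits = [
        sum(w * x for w, x in zip(row, video_embedding)) + b
        for row, b in zip(weights, biases)
    ]
    return softmax(logits)


rng = random.Random(0)

# Placeholder embeddings standing in for the outputs of the three streams.
spatial = [rng.gauss(0, 1) for _ in range(8)]
temporal = [rng.gauss(0, 1) for _ in range(8)]
audio = [rng.gauss(0, 1) for _ in range(4)]

video_embedding = fuse_embeddings(spatial, temporal, audio)

# Untrained random parameters, used only to show the data flow.
weights = [[rng.gauss(0, 0.1) for _ in video_embedding] for _ in range(NUM_CLASSES)]
biases = [0.0] * NUM_CLASSES

probs = classify(video_embedding, weights, biases)
predicted_class = max(range(NUM_CLASSES), key=probs.__getitem__)
```

The same `video_embedding` vector is the kind of compact representation that could be passed to another model, such as an autoencoder for anomaly detection, instead of the classifier.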


Pages: 181-185

Page count: 5