Motion Feature Network: Fixed Motion Filter for Action Recognition

Cited by: 89

Authors:

Lee, Myunggi [1,2]

Lee, Seungeui [1]

Son, Sungjoon [1,2]

Park, Gyutae [1,2]

Kwak, Nojun [1]

Affiliations:

[1] Seoul Natl Univ, Seoul, South Korea

[2] VDO Inc, Suwon, South Korea

Source:

COMPUTER VISION - ECCV 2018, PT X | 2018 / Vol. 11214

Keywords:

Action recognition; Motion filter; MFNet; Spatio-temporal representation

DOI:

10.1007/978-3-030-01249-6_24

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Spatio-temporal representations in frame sequences play an important role in action recognition. Previously, using optical flow as temporal information in combination with a set of RGB images that contain spatial information has shown great performance enhancement in action recognition tasks. However, this approach is computationally expensive and requires a two-stream (RGB and optical flow) framework. In this paper, we propose MFNet (Motion Feature Network), which contains motion blocks that make it possible to encode spatio-temporal information between adjacent frames in a unified network that can be trained end-to-end. The motion block can be attached to any existing CNN-based action recognition framework with only a small additional cost. We evaluated our network on two action recognition datasets (Jester and Something-Something) and achieved competitive performance on both datasets by training the networks from scratch.


Pages: 392-408

Page count: 17
