Asymmetric 3D Convolutional Neural Networks for action recognition

Cited by: 158

Authors:

Yang, Hao [1,3]

Yuan, Chunfeng [1]

Li, Bing [1]

Du, Yang [1,3]

Xing, Junliang [1]

Hu, Weiming [1,2,3]

Maybank, Stephen J. [4]

Affiliations:

[1] Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing 100190, Peoples R China

[2] Chinese Acad Sci, CAS Ctr Excellence Brain Sci & Intelligence Techn, Beijing 100190, Peoples R China

[3] Univ Chinese Acad Sci, Beijing 100190, Peoples R China

[4] Birkbeck Coll, Dept Comp Sci & Informat Syst, London WC1E 7HX, England

Source:

PATTERN RECOGNITION | 2019 / Vol. 85

Keywords:

Asymmetric 3D convolution; MicroNets; 3D-CNN; Action recognition; FEATURES; FLOW;

DOI:

10.1016/j.patcog.2018.07.028

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Convolutional Neural Network based action recognition methods have achieved significant improvements in recent years. The 3D convolution extends the 2D convolution to the spatial-temporal domain for better analysis of human activities in videos. The 3D convolution, however, involves many more parameters than the 2D convolution. Thus, it is much more expensive on computation, costly on storage, and difficult to learn. This work proposes efficient asymmetric one-directional 3D convolutions to approximate the traditional 3D convolution. To improve the feature learning capacity of asymmetric 3D convolutions, a set of local 3D convolutional networks, called MicroNets, are proposed by incorporating multi-scale 3D convolution branches. Then, an asymmetric 3D-CNN deep model is constructed by MicroNets for the action recognition task. Moreover, to avoid training two networks on the RGB and Flow frames separately as most works do, a simple but effective multi-source enhanced input is proposed, which fuses useful information of the RGB and Flow frame at the pre-processing stage. The asymmetric 3D-CNN model is evaluated on two of the most challenging action recognition benchmarks, UCF-101 and HMDB-51. The asymmetric 3D-CNN model outperforms all the traditional 3D-CNN models in both effectiveness and efficiency, and its performance is comparable with that of recent state-of-the-art action recognition methods on both benchmarks. (C) 2018 Elsevier Ltd. All rights reserved.
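The parameter saving that motivates the asymmetric convolutions can be illustrated with a rough count. The sketch below is not the authors' architecture: it assumes, purely for illustration, that one k x k x k convolution is replaced by three stacked one-directional convolutions (k x 1 x 1, 1 x k x 1, 1 x 1 x k), each with a bias term. The function names and the channel sizes are hypothetical.

```python
def conv3d_params(c_in, c_out, kt, kh, kw, bias=True):
    """Learnable parameters in one 3D convolution layer with a kt x kh x kw kernel."""
    return c_in * c_out * kt * kh * kw + (c_out if bias else 0)


def full_3d_params(c_in, c_out, k):
    """Parameters of a conventional k x k x k 3D convolution."""
    return conv3d_params(c_in, c_out, k, k, k)


def asymmetric_3d_params(c_in, c_out, k):
    """Parameters of three stacked one-directional convolutions.

    Assumption (not from the paper): the first layer maps c_in -> c_out,
    and the remaining two layers map c_out -> c_out.
    """
    return (
        conv3d_params(c_in, c_out, k, 1, 1)     # temporal direction
        + conv3d_params(c_out, c_out, 1, k, 1)  # vertical direction
        + conv3d_params(c_out, c_out, 1, 1, k)  # horizontal direction
    )


if __name__ == "__main__":
    c_in, c_out, k = 64, 64, 3  # hypothetical layer sizes
    full = full_3d_params(c_in, c_out, k)
    asym = asymmetric_3d_params(c_in, c_out, k)
    print(f"full 3D: {full}, asymmetric: {asym}, ratio: {full / asym:.2f}")
```

Under these assumptions the weight count grows with 3k rather than k^3, so the gap widens for larger kernels; the actual layer layout and savings in the paper may differ.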

Pages: 1-12

Page count: 12
