Asymmetric 3D Convolutional Neural Networks for action recognition

Cited by: 158
Authors
Yang, Hao [1, 3]
Yuan, Chunfeng [1]
Li, Bing [1]
Du, Yang [1, 3]
Xing, Junliang [1]
Hu, Weiming [1, 2, 3]
Maybank, Stephen J. [4]
Affiliations
[1] Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing 100190, Peoples R China
[2] Chinese Acad Sci, CAS Ctr Excellence Brain Sci & Intelligence Techn, Beijing 100190, Peoples R China
[3] Univ Chinese Acad Sci, Beijing 100190, Peoples R China
[4] Birkbeck Coll, Dept Comp Sci & Informat Syst, London WC1E 7HX, England
Keywords
Asymmetric 3D convolution; MicroNets; 3D-CNN; Action recognition; FEATURES; FLOW;
DOI
10.1016/j.patcog.2018.07.028
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Convolutional Neural Network based action recognition methods have achieved significant improvements in recent years. The 3D convolution extends the 2D convolution to the spatial-temporal domain for better analysis of human activities in videos. The 3D convolution, however, involves many more parameters than the 2D convolution. Thus, it is much more expensive on computation, costly on storage, and difficult to learn. This work proposes efficient asymmetric one-directional 3D convolutions to approximate the traditional 3D convolution. To improve the feature learning capacity of asymmetric 3D convolutions, a set of local 3D convolutional networks, called MicroNets, are proposed by incorporating multi-scale 3D convolution branches. Then, an asymmetric 3D-CNN deep model is constructed by MicroNets for the action recognition task. Moreover, to avoid training two networks on the RGB and Flow frames separately as most works do, a simple but effective multi-source enhanced input is proposed, which fuses useful information of the RGB and Flow frame at the pre-processing stage. The asymmetric 3D-CNN model is evaluated on two of the most challenging action recognition benchmarks, UCF-101 and HMDB-51. The asymmetric 3D-CNN model outperforms all the traditional 3D-CNN models in both effectiveness and efficiency, and its performance is comparable with that of recent state-of-the-art action recognition methods on both benchmarks. (C) 2018 Elsevier Ltd. All rights reserved.
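The abstract's point that a full 3D convolution carries many more parameters than a one-directional one can be checked with simple arithmetic. The following is a minimal sketch, not the authors' implementation: it assumes a k x k x k kernel is replaced by three cascaded one-directional kernels (k x 1 x 1, 1 x k x 1, 1 x 1 x k), a factorization the abstract does not spell out. The function names and the 64-channel setting are hypothetical and chosen only for illustration.

```python
def conv3d_params(c_in, c_out, kt, kh, kw, bias=True):
    """Count learnable parameters of one 3D convolution layer."""
    weights = c_in * c_out * kt * kh * kw
    return weights + (c_out if bias else 0)


def full_3d_params(c_in, c_out, k=3):
    """Parameters of a traditional k x k x k 3D convolution."""
    return conv3d_params(c_in, c_out, k, k, k)


def asymmetric_3d_params(c_in, c_out, k=3):
    """Parameters of three cascaded one-directional convolutions.

    Assumed factorization (not confirmed by the abstract):
    k x 1 x 1, then 1 x k x 1, then 1 x 1 x k.
    """
    first = conv3d_params(c_in, c_out, k, 1, 1)
    second = conv3d_params(c_out, c_out, 1, k, 1)
    third = conv3d_params(c_out, c_out, 1, 1, k)
    return first + second + third


channels = 64  # hypothetical layer width
full = full_3d_params(channels, channels)
asym = asymmetric_3d_params(channels, channels)
print(f"full 3D: {full}, asymmetric: {asym}, ratio: {full / asym:.2f}")
```

This only compares parameter counts. It says nothing about how closely the factorized layers approximate the full kernel, which is the question the paper's experiments address.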
Pages: 1-12
Number of pages: 12
Related papers
50 in total
  • [1] TIME-ASYMMETRIC 3D CONVOLUTIONAL NEURAL NETWORKS FOR ACTION RECOGNITION
    Wu, Chengjie
    Han, Jiayue
    Li, Xiaoqiang
    2019 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2019, : 21 - 25
  • [2] 3D Convolutional Neural Networks for Human Action Recognition
    Ji, Shuiwang
    Xu, Wei
    Yang, Ming
    Yu, Kai
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2013, 35 (01) : 221 - 231
  • [3] Action Recognition Based on Features Fusion and 3D Convolutional Neural Networks
    Liu, Lulu
    Hu, Fangyu
    Zhou, Jiahui
    PROCEEDINGS OF 2016 9TH INTERNATIONAL SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE AND DESIGN (ISCID), VOL 1, 2016, : 178 - 181
  • [4] 3D skeleton-based action recognition with convolutional neural networks
    Van-Nam Hoang
    Thi-Lan Le
    Thanh-Hai Tran
    Hai-Vu
    Van-Toi Nguyen
    2019 INTERNATIONAL CONFERENCE ON MULTIMEDIA ANALYSIS AND PATTERN RECOGNITION (MAPR), 2019,
  • [5] Basketball technique action recognition using 3D convolutional neural networks
    Wang, Jingfei
    Zuo, Liang
    Martinez, Carlos Cordente
    SCIENTIFIC REPORTS, 2024, 14 (01):
  • [6] An efficient attention module for 3d convolutional neural networks in action recognition
    Jiang, Guanghao
    Jiang, Xiaoyan
    Fang, Zhijun
    Chen, Shanshan
    APPLIED INTELLIGENCE, 2021, 51 (10) : 7043 - 7057
  • [7] SPATIOTEMPORAL PYRAMID POOLING IN 3D CONVOLUTIONAL NEURAL NETWORKS FOR ACTION RECOGNITION
    Cheng, Cheng
    Lv, Pin
    Su, Bing
    2018 25TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2018, : 3468 - 3472
  • [8] An efficient attention module for 3d convolutional neural networks in action recognition
    Guanghao Jiang
    Xiaoyan Jiang
    Zhijun Fang
    Shanshan Chen
    Applied Intelligence, 2021, 51 : 7043 - 7057
  • [9] 3D ACTION RECOGNITION USING DATA VISUALIZATION AND CONVOLUTIONAL NEURAL NETWORKS
    Liu, Mengyuan
    Chen, Chen
    Liu, Hong
    2017 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2017, : 925 - 930
  • [10] 3D Convolutional Neural Network for Action Recognition
    Zhang, Junhui
    Chen, Li
    Tian, Jing
    COMPUTER VISION, PT I, 2017, 771 : 600 - 607