Large-scale gesture recognition with a fusion of RGB-D data based on optical flow and the C3D model

Cited by: 31
Authors
Li, Yunan [1]
Miao, Qiguang [1]
Tian, Kuan [1]
Fan, Yingying [1]
Xu, Xin [1]
Ma, Zhenxin [1]
Song, Jianfeng [1]
Affiliations
[1] Xidian Univ, Xian, Shaanxi, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Gesture recognition; RGB-D data; Optical flow; 3D Convolutional Neural Networks
DOI
10.1016/j.patrec.2017.12.003
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Gesture recognition has attracted great attention owing to its applications in many fields, such as Human-Computer Interaction. However, in video-based gesture recognition, gesture-irrelevant factors such as the background hinder improvements in the recognition rate. In this paper, we propose an effective 3D Convolutional Neural Network based method for large-scale gesture recognition using RGB-D video data. To give the network compact input that still carries sufficient motion-path information, the inputs are first unified into 32-frame videos. Optical flow images are then constructed from the RGB videos frame by frame to help eliminate the distracting background in them. After that, the spatiotemporal features of the background-suppressed RGB data and the depth data are each extracted with the C3D model (a 3D CNN model) and then fused according to discriminant correlation analysis to boost performance. Finally, the classes are predicted with a linear SVM classifier. Our proposed method achieves 54.50% accuracy on the validation subset and 60.93% on the testing subset of the Chalearn LAP IsoGD dataset, both of which outperform our results (ranked 1st place) in the Chalearn LAP Large-scale Gesture Recognition Challenge. (C) 2017 Published by Elsevier B.V.
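The first step of the pipeline, unifying every clip to 32 frames, can be illustrated with a minimal sketch. The abstract does not state how frames are selected, so the uniform index sampling below is an assumption, and the function names `unify_frame_indices` and `unify_clip` are hypothetical and not taken from the paper.

```python
def unify_frame_indices(num_frames, target=32):
    """Return `target` frame indices spread evenly over a clip.

    Assumption: plain uniform sampling. The paper's actual
    frame-selection scheme is not given in the abstract.
    """
    if num_frames <= 0:
        raise ValueError("clip must contain at least one frame")
    # Each output slot maps to the start of its proportional segment
    # in the source clip: short clips repeat frames, long clips skip them.
    return [(i * num_frames) // target for i in range(target)]


def unify_clip(frames, target=32):
    """Resample a sequence of frames to exactly `target` frames."""
    indices = unify_frame_indices(len(frames), target)
    return [frames[i] for i in indices]
```

Under this assumption, a 64-frame clip keeps every second frame and a 16-frame clip shows each frame twice, so every clip reaches the network with the same temporal length.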
Pages: 187-194
Number of pages: 8
Related papers
38 in total
  • [31] 3D optical flow for large CT data of materials microstructures
    Nogatz, Tessa
    Redenbach, Claudia
    Schladitz, Katja
    STRAIN, 2022, 58 (03)
  • [32] View-invariant gesture recognition using 3D optical flow and harmonic motion context
    Holte, M. B.
    Moeslund, T. B.
    Fihl, P.
    COMPUTER VISION AND IMAGE UNDERSTANDING, 2010, 114 (12) : 1353 - 1361
  • [33] An adaptive hidden Markov model-based gesture recognition approach using Kinect to simplify large-scale video data processing for humanoid robot imitation
    Ing-Jr Ding
    Che-Wei Chang
    Multimedia Tools and Applications, 2016, 75 : 15537 - 15551
  • [34] An adaptive hidden Markov model-based gesture recognition approach using Kinect to simplify large-scale video data processing for humanoid robot imitation
    Ding, Ing-Jr
    Chang, Che-Wei
    MULTIMEDIA TOOLS AND APPLICATIONS, 2016, 75 (23) : 15537 - 15551
  • [35] Multimodal Gesture Recognition for Mascot Robot System Based on Choquet Integral Using Camera and 3D Accelerometers Fusion
    Tang, Yongkang
    Vu, Hai An
    Le, Phuc Q.
    Masano, Daisuke
    Thet, Oo Han
    Fatichah, Chastine
    Liu, Zhentao
    Yamaguchi, Masashi
    Tangel, Martin Leonard
    Dong, Fangyan
    Yamazaki, Yoichi
    Hirota, Kaoru
    JOURNAL OF ADVANCED COMPUTATIONAL INTELLIGENCE AND INTELLIGENT INFORMATICS, 2011, 15 (05) : 563 - 572
  • [36] An eigenspace-based method with a user adaptation scheme for human gesture recognition by using Kinect 3D data
    Ding, Ing-Jr
    Chang, Che-Wei
    APPLIED MATHEMATICAL MODELLING, 2015, 39 (19) : 5769 - 5777
  • [37] AL-MobileNet: a novel model for 2D gesture recognition in intelligent cockpit based on multi-modal data
    Wang, Bin
    Yu, Liwen
    Zhang, Bo
    ARTIFICIAL INTELLIGENCE REVIEW, 2024, 57 (10)
  • [38] A Continuity Equation Based Optical Flow Method for Cardiac Motion Correction in 3D PET Data
    Dawood, Mohammad
    Brune, Christoph
    Jiang, Xiaoyi
    Buether, Florian
    Burger, Martin
    Schober, Otmar
    Schaefers, Michael
    Schaefers, Klaus P.
    MEDICAL IMAGING AND AUGMENTED REALITY, 2010, 6326 : 88 - +