Large-scale gesture recognition with a fusion of RGB-D data based on optical flow and the C3D model

Cited by: 31
Authors
Li, Yunan [1]
Miao, Qiguang [1]
Tian, Kuan [1]
Fan, Yingying [1]
Xu, Xin [1]
Ma, Zhenxin [1]
Song, Jianfeng [1]
Affiliations
[1] Xidian University, Xi'an, Shaanxi, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
Gesture recognition; RGB-D data; Optical flow; 3D Convolutional Neural Networks
DOI
10.1016/j.patrec.2017.12.003
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Gesture recognition has attracted great attention owing to its applications in many fields, such as human-computer interaction. However, in video-based gesture recognition, gesture-irrelevant factors such as the background hinder improvements in the recognition rate. In this paper, we propose an effective 3D Convolutional Neural Network based method for large-scale gesture recognition using RGB-D video data. To give the network compact data that still carry sufficient motion-path information, the inputs are first unified into 32-frame videos. Optical flow images are then constructed from the RGB videos frame by frame to help eliminate the disturbing background. After that, spatiotemporal features of the background-suppressed RGB data and the depth data are extracted separately with the C3D model (a 3D CNN model) and fused in the next stage using discriminant correlation analysis to boost performance. Finally, the classes are predicted with a linear SVM classifier. Our proposed method achieves 54.50% accuracy on the validation subset and 60.93% on the testing subset of the Chalearn LAP IsoGD dataset, both of which outperform our results (ranked 1st place) in the Chalearn LAP Large-scale Gesture Recognition Challenge. (C) 2017 Published by Elsevier B.V.
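The abstract does not say how clips are unified to 32 frames. As a minimal sketch, assuming plain uniform index sampling (frames are skipped in longer clips and repeated in shorter ones), the step might look like the following. The function name `unify_frame_indices` and the sampling rule are illustrative assumptions, not the authors' implementation.

```python
def unify_frame_indices(num_frames, target=32):
    """Return `target` frame indices spread uniformly over a clip.

    Hypothetical helper: clips longer than `target` are subsampled,
    shorter clips have some frames repeated.
    """
    if num_frames <= 0:
        raise ValueError("num_frames must be positive")
    # Integer arithmetic keeps every index inside [0, num_frames - 1].
    return [(i * num_frames) // target for i in range(target)]


# A 64-frame clip keeps every second frame; a 16-frame clip repeats each frame.
long_clip = unify_frame_indices(64)
short_clip = unify_frame_indices(16)
```

Selecting `frames[i]` for each returned index gives a fixed-length clip that can be fed to a network expecting 32-frame inputs.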
Pages: 187-194
Number of pages: 8