Feature Fusion for Human Action Recognition based on Classical Descriptors and 3D convolutional networks

Cited by: 0
Authors
Qin, Yang [1]
Mo, Lingfei [1]
Xie, Benyi [1]
Affiliation
[1] Southeast Univ, Sch Instrument Sci & Engn, Nanjing, Jiangsu, Peoples R China
Source
2017 ELEVENTH INTERNATIONAL CONFERENCE ON SENSING TECHNOLOGY (ICST) | 2017
Keywords
Human Action Recognition; Feature Fusion; 3D Convolutional Network; Classical Descriptor;
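DOI
Not available
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Discipline classification codes
0808; 0809;
Abstract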
This paper proposes a feature fusion method that combines several kinds of classical descriptors with multi-channel 3-dimensional convolutional neural networks (3D CNNs) for human action recognition (HAR). The interrelationship between the classical descriptors and the 3D convolutional filters is explored. Spatio-temporal features are learned by 3D convolutional networks trained on a large-scale labeled video dataset. The classical descriptors serve as auxiliary features and are combined with the features learned by the 3D CNN to form a fused feature vector. Feeding this fused feature vector into an SVM classifier improves recognition accuracy. Verification experiments were carried out on different datasets: the recognition rate is 95.1% on the KTH dataset and 86.6% on the UCF101 dataset. The experimental results show that this feature fusion method performs efficiently and robustly on human action recognition.
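The fusion step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the features are combined, so the per-block L2 normalization and plain concatenation used here are assumptions, not the authors' confirmed procedure. The function names and the toy values (`cnn`, `descriptor_a`, `descriptor_b`) are hypothetical placeholders.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit Euclidean length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def fuse_features(cnn_features, descriptor_features):
    """Concatenate normalized 3D-CNN features with each normalized classical descriptor.

    Assumption: per-block L2 normalization followed by concatenation; the
    abstract does not specify the actual fusion rule.
    """
    fused = l2_normalize(cnn_features)
    for descriptor in descriptor_features:
        fused.extend(l2_normalize(descriptor))
    return fused


# Toy values standing in for real features (hypothetical, for illustration only).
cnn = [3.0, 4.0]
descriptor_a = [1.0, 0.0, 0.0]
descriptor_b = [0.0, 2.0]

fused = fuse_features(cnn, [descriptor_a, descriptor_b])
print(len(fused))
```

In the pipeline the abstract outlines, a vector like `fused` would be computed per video clip and passed to an SVM classifier. Normalizing each block first keeps one feature type from dominating the others purely because of its scale.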
Pages: 487-491
Page count: 5