Spatio-temporal Channel Correlation Networks for Action Classification

Cited by: 129
Authors
Diba, Ali [1, 4]
Fayyaz, Mohsen [2]
Sharma, Vivek [3]
Arzani, M. Mahdi [4]
Yousefzadeh, Rahman [4]
Gall, Juergen [2]
Van Gool, Luc [1, 4]
Affiliations
[1] Katholieke Univ Leuven, ESAT PSI, Leuven, Belgium
[2] Univ Bonn, Bonn, Germany
[3] KIT, CV HCI, Karlsruhe, Germany
[4] Sensifai, Brussels, Belgium
Source
Funding
European Research Council;
Keywords
RECOGNITION; HISTOGRAMS;
DOI
10.1007/978-3-030-01225-0_18
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This work is driven by the question of whether spatio-temporal correlations are enough for 3D convolutional neural networks (CNNs). Most traditional 3D networks use local spatio-temporal features. We introduce a new block that models correlations between the channels of a 3D CNN with respect to temporal and spatial features. This block can be added as a residual unit to different parts of a 3D CNN. We name it the 'Spatio-Temporal Channel Correlation' (STC) block. Embedding this block in current state-of-the-art architectures such as ResNeXt and ResNet improves performance by 2-3% on the Kinetics dataset. Our experiments show that architectures augmented with STC blocks outperform state-of-the-art methods on the HMDB51, UCF101 and Kinetics datasets. A second issue is that 3D CNNs are trained from scratch and need a huge labeled dataset to reach reasonable performance, so the knowledge learned by 2D CNNs is completely ignored. Another contribution of this work is a simple and effective technique for transferring knowledge from a pre-trained 2D CNN to a randomly initialized 3D CNN for a stable weight initialization. This allows us to significantly reduce the number of training samples for 3D CNNs. By fine-tuning this network, we beat the performance of generic and recent 3D CNN methods that were trained on large video datasets, e.g. Sports-1M, and fine-tuned on the target datasets, e.g. HMDB51/UCF101.
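The abstract does not spell out the internals of the STC block, so the following is only a rough illustration of the general idea of modeling channel correlations in a 3D feature map and adding the result back as a residual. It uses a squeeze-and-excitation-style channel gate as a stand-in, not the authors' design: it does not separate temporal and spatial branches, and the function name `channel_gate_residual`, the weight names, the reduction ratio `r`, and the `(C, T, H, W)` layout are all assumptions made for this sketch.

```python
import numpy as np

def channel_gate_residual(x, w_reduce, w_expand):
    """Hypothetical channel-gating residual unit (not the paper's exact block).

    x        : feature map of shape (C, T, H, W)
    w_reduce : bottleneck weights of shape (C // r, C)
    w_expand : expansion weights of shape (C, C // r)
    """
    # Squeeze: average each channel over time and space -> shape (C,)
    descriptor = x.mean(axis=(1, 2, 3))
    # Bottleneck layer with ReLU mixes information across channels
    hidden = np.maximum(w_reduce @ descriptor, 0.0)
    # Sigmoid turns the expanded vector into per-channel gates in (0, 1)
    gate = 1.0 / (1.0 + np.exp(-(w_expand @ hidden)))
    # Rescale every channel by its gate and add the result to the input
    return x + x * gate[:, None, None, None]

rng = np.random.default_rng(0)
C, T, H, W, r = 8, 4, 6, 6, 4  # illustrative sizes only
x = rng.standard_normal((C, T, H, W))
w_reduce = rng.standard_normal((C // r, C)) * 0.1
w_expand = rng.standard_normal((C, C // r)) * 0.1
y = channel_gate_residual(x, w_reduce, w_expand)
print(y.shape)
```

Because the unit keeps the input shape, it could in principle be placed after any 3D convolutional stage; where the authors actually insert their block is described in the paper itself, not in this record.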
Pages: 299-315
Page count: 17