Spatio-temporal Channel Correlation Networks for Action Classification

Cited by: 128
Authors
Diba, Ali [1, 4]
Fayyaz, Mohsen [2 ]
Sharma, Vivek [3 ]
Arzani, M. Mahdi [4 ]
Yousefzadeh, Rahman [4 ]
Gall, Juergen [2 ]
Van Gool, Luc [1, 4]
Affiliations
[1] Katholieke Univ Leuven, ESAT PSI, Leuven, Belgium
[2] Univ Bonn, Bonn, Germany
[3] KIT, CV HCI, Karlsruhe, Germany
[4] Sensifai, Brussels, Belgium
Source
COMPUTER VISION - ECCV 2018, PT IV | 2018 / Vol. 11208
Funding
European Research Council;
Keywords
RECOGNITION; HISTOGRAMS;
DOI
10.1007/978-3-030-01225-0_18
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The work in this paper is driven by the question of whether spatio-temporal correlations are enough for 3D convolutional neural networks (CNNs). Most traditional 3D networks use local spatio-temporal features. We introduce a new block that models correlations between the channels of a 3D CNN with respect to temporal and spatial features. This new block can be added as a residual unit to different parts of 3D CNNs. We name our novel block 'Spatio-Temporal Channel Correlation' (STC). By embedding this block in current state-of-the-art architectures such as ResNext and ResNet, we improve performance by 2-3% on the Kinetics dataset. Our experiments show that adding STC blocks to current state-of-the-art architectures outperforms the state-of-the-art methods on the HMDB51, UCF101 and Kinetics datasets. The other issue in training 3D CNNs is that they must be trained from scratch on a huge labeled dataset to reach reasonable performance, so the knowledge learned by 2D CNNs is completely ignored. Another contribution of this work is a simple and effective technique to transfer knowledge from a pre-trained 2D CNN to a randomly initialized 3D CNN for stable weight initialization. This allows us to significantly reduce the number of training samples for 3D CNNs. Thus, by fine-tuning this network, we beat the performance of generic and recent 3D CNN methods that were trained on large video datasets, e.g. Sports-1M, and fine-tuned on the target datasets, e.g. HMDB51/UCF101.
Pages: 299 - 315
Number of pages: 17
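
Illustrative sketch
The abstract describes a block that models correlations between the channels of a 3D CNN with respect to spatial and temporal features and that is added as a residual unit. The record gives no architectural details, so the following NumPy sketch is only one plausible reading, not the authors' STC block: it assumes two squeeze-and-gate branches (one pooling over space only, one pooling over space and time), a bottleneck of two dense layers per branch, and an averaged residual sum. The names `channel_gate`, `channel_correlation_block`, `make_params` and the `reduction` parameter are hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_gate(descriptor, w1, w2):
    """Bottleneck of two dense layers (ReLU, then sigmoid) giving channel weights in (0, 1)."""
    hidden = np.maximum(descriptor @ w1, 0.0)
    return sigmoid(hidden @ w2)


def make_params(channels, reduction, rng):
    """Random weights for both branches (hypothetical initialisation)."""
    hidden = max(channels // reduction, 1)
    return {
        "a1": rng.normal(0.0, 0.1, (channels, hidden)),
        "a2": rng.normal(0.0, 0.1, (hidden, channels)),
        "b1": rng.normal(0.0, 0.1, (channels, hidden)),
        "b2": rng.normal(0.0, 0.1, (hidden, channels)),
    }


def channel_correlation_block(x, params):
    """x has shape (C, T, H, W); the output has the same shape."""
    # Branch A (assumed): pool over H and W only, so each time step keeps
    # its own channel descriptor, then gate the channels per time step.
    per_frame_desc = x.mean(axis=(2, 3)).T                      # (T, C)
    gate_a = channel_gate(per_frame_desc, params["a1"], params["a2"])
    branch_a = x * gate_a.T[:, :, None, None]

    # Branch B (assumed): pool over T, H and W, giving one descriptor per
    # channel for the whole clip, then gate the channels globally.
    clip_desc = x.mean(axis=(1, 2, 3))                          # (C,)
    gate_b = channel_gate(clip_desc, params["b1"], params["b2"])
    branch_b = x * gate_b[:, None, None, None]

    # Residual combination (assumed): identity plus the mean of both branches.
    return x + 0.5 * (branch_a + branch_b)


rng = np.random.default_rng(0)
features = rng.normal(size=(8, 4, 6, 6))   # C=8, T=4, H=W=6
out = channel_correlation_block(features, make_params(8, reduction=4, rng=rng))
print(out.shape)
```

Because the block preserves the input shape, a unit of this kind could in principle be placed after any 3D convolutional stage; where the authors actually insert it is not stated in this record.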