Spatio-temporal Channel Correlation Networks for Action Classification

Cited by: 129
Authors
Diba, Ali [1, 4]
Fayyaz, Mohsen [2]
Sharma, Vivek [3]
Arzani, M. Mahdi [4]
Yousefzadeh, Rahman [4]
Gall, Juergen [2]
Van Gool, Luc [1, 4]
Affiliations
[1] Katholieke Univ Leuven, ESAT PSI, Leuven, Belgium
[2] Univ Bonn, Bonn, Germany
[3] KIT, CV HCI, Karlsruhe, Germany
[4] Sensifai, Brussels, Belgium
Source
Funding
European Research Council
Keywords
RECOGNITION; HISTOGRAMS;
DOI
10.1007/978-3-030-01225-0_18
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The work in this paper is driven by the question of whether spatio-temporal correlations are enough for 3D convolutional neural networks (CNNs). Most traditional 3D networks use local spatio-temporal features. We introduce a new block that models correlations between the channels of a 3D CNN with respect to temporal and spatial features. This new block can be added as a residual unit to different parts of 3D CNNs. We name our novel block 'Spatio-Temporal Channel Correlation' (STC). By embedding this block into current state-of-the-art architectures such as ResNeXt and ResNet, we improve performance by 2-3% on the Kinetics dataset. Our experiments show that adding STC blocks to these architectures outperforms the state-of-the-art methods on the HMDB51, UCF101 and Kinetics datasets. The other issue in training 3D CNNs is that they must be trained from scratch on a huge labeled dataset to reach reasonable performance, so the knowledge learned by 2D CNNs is completely ignored. Another contribution of this work is a simple and effective technique to transfer knowledge from a pre-trained 2D CNN to a randomly initialized 3D CNN for a stable weight initialization. This allows us to significantly reduce the number of training samples for 3D CNNs. Thus, by fine-tuning this network, we beat the performance of generic and recent 3D CNN methods that were trained on large video datasets, e.g. Sports-1M, and fine-tuned on the target datasets, e.g. HMDB51/UCF101.
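To make the idea of a channel-correlation block added as a residual unit more concrete, the following is a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify the block's internals, so the squeeze-and-excitation-style gating, the two pooling branches, the reduction ratio `r`, and every function and variable name here are assumptions chosen for illustration only.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bottleneck_gate(desc, w_down, w_up):
    """Map channel descriptors to gates in (0, 1) with a two-layer bottleneck.

    desc: (..., C), w_down: (C, C // r), w_up: (C // r, C).
    """
    hidden = np.maximum(desc @ w_down, 0.0)  # ReLU
    return sigmoid(hidden @ w_up)

def channel_correlation_block(x, weights):
    """Hypothetical channel-gating residual unit for a (C, T, H, W) feature map."""
    # Branch 1 (assumed): pool over time and space, one gate per channel.
    g_all = bottleneck_gate(x.mean(axis=(1, 2, 3)), *weights["all"])     # (C,)
    # Branch 2 (assumed): pool over space only, one gate per channel and frame.
    g_frame = bottleneck_gate(x.mean(axis=(2, 3)).T, *weights["frame"])  # (T, C)
    scaled_all = x * g_all[:, None, None, None]
    scaled_frame = x * g_frame.T[:, :, None, None]
    # Residual connection: the input passes through unchanged and the
    # channel-reweighted features are added on top.
    return x + scaled_all + scaled_frame

# Toy usage with random weights (illustrative sizes only).
rng = np.random.default_rng(0)
C, T, H, W, r = 8, 4, 5, 5, 4
weights = {
    name: (rng.normal(size=(C, C // r)), rng.normal(size=(C // r, C)))
    for name in ("all", "frame")
}
x = rng.normal(size=(C, T, H, W))
y = channel_correlation_block(x, weights)
print(y.shape)  # (8, 4, 5, 5)
```

Because the output has the same shape as the input, a unit of this kind could in principle be placed after any stage of a 3D CNN, which matches the abstract's statement that the block can be added to different parts of the network.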
Pages: 299-315
Number of pages: 17
Related papers
50 in total
  • [21] SPATIO-TEMPORAL CO-OCCURRENCE CHARACTERIZATIONS FOR HUMAN ACTION CLASSIFICATION
    Sabri, Aznul Qalid Md
    Boonaert, Jacques
    Abdullah, Erma Rahayu Mohd Faizal
    Mansoor, Ali Mohammed
    MALAYSIAN JOURNAL OF COMPUTER SCIENCE, 2017, 30 (03) : 154 - 173
  • [22] IMPROVING SPATIO-TEMPORAL FEATURE EXTRACTION TECHNIQUES AND THEIR APPLICATIONS IN ACTION CLASSIFICATION
    Mesmakhosroshahi, Maral
    Kim, Joohee
    2012 IEEE VISUAL COMMUNICATIONS AND IMAGE PROCESSING (VCIP), 2012,
  • [23] Video action re-localization using spatio-temporal correlation
    Ramaswamy, Akshaya
    Seemakurthy, Karthik
    Gubbi, Jayavardhana
    Balamuralidhar, P.
    2022 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION WORKSHOPS (WACVW 2022), 2022, : 192 - 201
  • [24] Action MACH - A spatio-temporal maximum average correlation height filter for action recognition
    Rodriguez, Mikel D.
    Ahmed, Javed
    Shah, Mubarak
    2008 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, VOLS 1-12, 2008, : 3001 - +
  • [25] An efficient spatio-temporal index for spatio-temporal query in wireless sensor networks
    Lee, Donhee
    Yoon, Kyoungro
    KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS, 2017, 11 (10): : 4888 - 4908
  • [26] Estimating and modeling spatio-temporal correlation structures for river monitoring networks
    L. Clement
    O. Thas
    Journal of Agricultural, Biological, and Environmental Statistics, 2007, 12
  • [27] Estimating and modeling spatio-temporal correlation structures for river monitoring networks
    Clement, L.
    Thas, O.
    JOURNAL OF AGRICULTURAL BIOLOGICAL AND ENVIRONMENTAL STATISTICS, 2007, 12 (02) : 161 - 176
  • [28] Human Action Recognition using Factorized Spatio-Temporal Convolutional Networks
    Sun, Lin
    Jia, Kui
    Yeung, Dit-Yan
    Shi, Bertram E.
    2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015, : 4597 - 4605
  • [29] Exploring hybrid spatio-temporal convolutional networks for human action recognition
    Wang, Hao
    Yang, Yanhua
    Yang, Erkun
    Deng, Cheng
    MULTIMEDIA TOOLS AND APPLICATIONS, 2017, 76 (13) : 15065 - 15081
  • [30] EXPLOITING STRUCTURE OF SPATIO-TEMPORAL CORRELATION FOR DETECTION IN WIRELESS SENSOR NETWORKS
    Ali, Sadiq
    Lopez-Salcedo, Jose A.
    Seco-Granados, Gonzalo
    2012 PROCEEDINGS OF THE 20TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2012, : 774 - 778