CSST-Net: Channel Split Spatiotemporal Network for Human Action Recognition

Cited by: 1
Authors
Zhou, Xuan [1]
Ma, Jixiang [1]
Yi, Jianping [2]
Affiliations
[1] Xian Traff Engn Inst, Sch Mech & Elect Engn, Xian 710300, Peoples R China
[2] Xian Polytech Univ, Sch Elect & Informat, Xian 710048, Peoples R China
Source
INFORMATION TECHNOLOGY AND CONTROL | 2023 / Vol. 52 / No. 04
Keywords
Temporal reasoning; Action recognition; Spatiotemporal representation learning; Spatiotemporal fusion
DOI
10.5755/j01.itc.52.4.33239
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Temporal reasoning is crucial for action recognition. Previous works use 3D CNNs to capture spatiotemporal information jointly, but at a high computational cost. To address this problem, we propose a general channel split spatiotemporal network (CSST-Net) for effective spatiotemporal feature representation learning. The CSST module consists of a grouped spatiotemporal modeling (GSTM) module and a parameter-free feature fusion (PFFF) module. The GSTM module splits features along the channel dimension into spatial and temporal parts that are processed in parallel, focusing on spatial and temporal cues, respectively. In the temporal branch, we combine group-wise convolution and point-wise convolution to reduce the number of parameters, thus alleviating the overfitting of 3D CNNs. For spatiotemporal feature fusion, the PFFF module recalibrates and fuses the spatial and temporal features with a soft attention mechanism that introduces no extra parameters, thus preserving correct information flow through the network. Finally, extensive experiments on three benchmark datasets (Sth-Sth V1, Sth-Sth V2, and Jester) indicate that the proposed CSST-Net achieves competitive performance compared with existing methods while significantly reducing the number of parameters and FLOPs relative to the 3D CNN baseline.
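The parameter saving described in the abstract can be illustrated with a rough weight count. The sketch below is not the authors' implementation: the kernel shapes (3x3x3 for the 3D baseline, 1x3x3 for the spatial branch, 3x1x1 for the temporal branch), the even channel split, and the depthwise grouping are all assumptions chosen for illustration, and the helper names `conv_params` and `split_block_params` are hypothetical.

```python
import math


def conv_params(c_in, c_out, kernel, groups=1):
    """Weight count of a bias-free convolution with the given kernel shape."""
    assert c_in % groups == 0 and c_out % groups == 0
    return (c_in // groups) * c_out * math.prod(kernel)


def split_block_params(channels, temporal_ratio=0.5, groups=None):
    """Weight count of a channel-split block (assumed layout, not from the paper).

    Spatial channels get a 1x3x3 convolution; temporal channels get a
    group-wise 3x1x1 convolution followed by a point-wise 1x1x1 convolution.
    """
    c_t = int(channels * temporal_ratio)   # channels sent to the temporal branch
    c_s = channels - c_t                   # channels sent to the spatial branch
    groups = c_t if groups is None else groups  # assumption: depthwise by default
    spatial = conv_params(c_s, c_s, (1, 3, 3))
    temporal = (conv_params(c_t, c_t, (3, 1, 1), groups=groups)
                + conv_params(c_t, c_t, (1, 1, 1)))
    return spatial + temporal


baseline = conv_params(64, 64, (3, 3, 3))  # full 3D convolution on 64 channels
split = split_block_params(64)             # channel-split alternative
print(baseline, split, round(split / baseline, 4))
```

Under these assumptions, the split block on 64 channels needs 10336 weights against 110592 for the full 3D convolution. The exact ratio in CSST-Net depends on the split ratio, group count, and kernel sizes used in the paper, none of which are given in this record.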
Pages: 952-965
Number of pages: 14