CSST-Net: Channel Split Spatiotemporal Network for Human Action Recognition

Times Cited: 1
Authors
Zhou, Xuan [1 ]
Ma, Jixiang [1 ]
Yi, Jianping [2 ]
Affiliations
[1] Xian Traff Engn Inst, Sch Mech & Elect Engn, Xian 710300, Peoples R China
[2] Xian Polytech Univ, Sch Elect & Informat, Xian 710048, Peoples R China
Source
INFORMATION TECHNOLOGY AND CONTROL | 2023, Vol. 52, No. 4
Keywords
Temporal reasoning; Action recognition; Spatiotemporal representation learning; Spatiotemporal fusion;
DOI
10.5755/j01.itc.52.4.33239
Chinese Library Classification (CLC)
TP [Automation and computer technology];
Discipline Classification Code
0812;
Abstract
Temporal reasoning is crucial for action recognition tasks. Previous works use 3D CNNs to jointly capture spatiotemporal information, but this incurs substantial computational cost. To address these problems, we propose a general channel split spatiotemporal network (CSST-Net) to achieve effective spatiotemporal feature representation learning. The CSST module consists of the grouped spatiotemporal modeling (GSTM) module and the parameter-free feature fusion (PFFF) module. The GSTM module decomposes features into spatial and temporal parts along the channel dimension in parallel, focusing on spatial and temporal clues, respectively. Meanwhile, we utilize a combination of group-wise convolution and point-wise convolution to reduce the number of parameters in the temporal branch, thus alleviating the overfitting of 3D CNNs. Furthermore, to address spatiotemporal feature fusion, the PFFF module recalibrates and fuses spatial and temporal features through a soft attention mechanism without introducing extra parameters, thereby ensuring effective information flow through the network. Finally, extensive experiments on three benchmark datasets (Sth-Sth V1, Sth-Sth V2, and Jester) indicate that the proposed CSST-Net achieves competitive performance compared to existing methods while significantly reducing the number of parameters and FLOPs relative to the 3D CNN baseline.
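The abstract outlines the structure of the CSST module: a channel split into a spatial branch and a temporal branch (the GSTM idea), with the temporal branch built from group-wise plus point-wise convolutions, followed by a parameter-free soft-attention fusion (PFFF). The PyTorch sketch below only illustrates this idea; the split ratio, kernel sizes, group count, and the exact form of the soft attention (here, a sigmoid of each branch's global average) are assumptions, since the abstract does not specify them.
```python
import torch
import torch.nn as nn


class CSSTModule(nn.Module):
    """Hypothetical sketch of a channel-split spatiotemporal block.

    Based only on the abstract; the exact CSST-Net configuration
    (split ratio, kernel sizes, groups, fusion formula) is assumed.
    Input/output shape: (N, C, T, H, W).
    """

    def __init__(self, channels, split_ratio=0.5, groups=8):
        super().__init__()
        self.c_spatial = int(channels * split_ratio)   # channels for the spatial branch
        self.c_temporal = channels - self.c_spatial    # channels for the temporal branch

        # Spatial branch: per-frame (1 x 3 x 3) convolution capturing spatial clues.
        self.spatial_conv = nn.Conv3d(
            self.c_spatial, self.c_spatial,
            kernel_size=(1, 3, 3), padding=(0, 1, 1), bias=False)

        # Temporal branch (GSTM idea): group-wise (3 x 1 x 1) temporal convolution
        # followed by a point-wise (1 x 1 x 1) convolution to mix the groups,
        # keeping the parameter count well below a full 3D convolution.
        self.temporal_group_conv = nn.Conv3d(
            self.c_temporal, self.c_temporal,
            kernel_size=(3, 1, 1), padding=(1, 0, 0),
            groups=groups, bias=False)
        self.temporal_point_conv = nn.Conv3d(
            self.c_temporal, self.c_temporal, kernel_size=1, bias=False)

    def forward(self, x):
        # Split the input features along the channel dimension.
        xs, xt = torch.split(x, [self.c_spatial, self.c_temporal], dim=1)

        fs = self.spatial_conv(xs)                                     # spatial clues
        ft = self.temporal_point_conv(self.temporal_group_conv(xt))    # temporal clues

        # PFFF-style parameter-free fusion: recalibrate each branch with a soft
        # attention weight derived from its own global average (no learnable
        # parameters are introduced here).
        fs = fs * torch.sigmoid(fs.mean(dim=(2, 3, 4), keepdim=True))
        ft = ft * torch.sigmoid(ft.mean(dim=(2, 3, 4), keepdim=True))

        # Concatenate back along channels and keep a residual path.
        return x + torch.cat([fs, ft], dim=1)


if __name__ == "__main__":
    # Toy usage: batch of 2 clips, 64 channels, 8 frames, 56 x 56 resolution.
    block = CSSTModule(channels=64)
    clip = torch.randn(2, 64, 8, 56, 56)
    print(block(clip).shape)  # torch.Size([2, 64, 8, 56, 56])
```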
Pages: 952-965
Page count: 14
Related Papers
50 records
  • [31] ACA-Net: adaptive context-aware network for basketball action recognition
    Zhang, Yaolei
    Zhang, Fei
    Zhou, Yuanli
    Xu, Xiao
    FRONTIERS IN NEUROROBOTICS, 2024, 18
  • [32] CLS-Net: An Action Recognition Algorithm Based on Channel-Temporal Information Modeling
    Xue, Mengfan
    Zheng, Jiannan
    Li, Tao
    Peng, Dongliang
    INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2023, 37 (08)
  • [33] 3D network with channel excitation and knowledge distillation for action recognition
    Hu, Zhengping
    Mao, Jianzeng
    Yao, Jianxin
    Bi, Shuai
    FRONTIERS IN NEUROROBOTICS, 2023, 17
  • [34] Spatial-temporal channel-wise attention network for action recognition
    Chen, Lin
    Liu, Yungang
    Man, Yongchao
    MULTIMEDIA TOOLS AND APPLICATIONS, 2021, 80 (14) : 21789 - 21808
  • [35] Spatial-temporal channel-wise attention network for action recognition
    Lin Chen
    Yungang Liu
    Yongchao Man
    Multimedia Tools and Applications, 2021, 80 : 21789 - 21808
  • [36] Spatiotemporal Progressive Inward-Outward Aggregation Network for skeleton-based action recognition
    Yin, Xinpeng
    Zhong, Jianqi
    Lian, Deliang
    Cao, Wenming
    PATTERN RECOGNITION, 2024, 150
  • [37] Human Tumble Action Recognition Using Spiking Neuron Network
    Li, Yu
    Wang, Ke
    Huang, MinFeng
    Li, RuiFeng
    Gao, TianZe
    Wu, Jun
    PROCEEDINGS OF THE 2019 31ST CHINESE CONTROL AND DECISION CONFERENCE (CCDC 2019), 2019: 5309 - 5313
  • [38] A Channel-Wise Spatial-Temporal Aggregation Network for Action Recognition
    Wang, Huafeng
    Xia, Tao
    Li, Hanlin
    Gu, Xianfeng
    Lv, Weifeng
    Wang, Yuehai
    MATHEMATICS, 2021, 9 (24)
  • [39] LAGA-Net: Local-and-Global Attention Network for Skeleton Based Action Recognition
    Xia, Rongjie
    Li, Yanshan
    Luo, Wenhan
    IEEE TRANSACTIONS ON MULTIMEDIA, 2022, 24 : 2648 - 2661
  • [40] Motion-Guided Graph Convolutional Network for Human Action Recognition
    Li, Jingjing
    Huang, Zhangjin
    Zou, Lu
    Jisuanji Fuzhu Sheji Yu Tuxingxue Xuebao/Journal of Computer-Aided Design and Computer Graphics, 2024, 36 (07): 1077 - 1086