Extracting hierarchical spatial and temporal features for human action recognition

Cited by: 12
Authors
Zhang, Keting [1]
Zhang, Liqing [1]
Affiliations
[1] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, Key Lab Shanghai Educ Commiss Intelligent Interac, Shanghai, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Hierarchical feature extraction; Dual-channel model; Subspace network; Spatial and temporal representation; Action recognition; PARALLEL FRAMEWORK; HEVC;
DOI
10.1007/s11042-017-5179-7
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Subject classification code
0812
Abstract
Human action recognition is a challenging computer vision task, and many efforts have been made to improve performance. Most previous work has concentrated on hand-crafted features or on spatial-temporal features learned from multiple contiguous frames. In this paper, we present a dual-channel model that decouples spatial and temporal feature extraction. More specifically, we propose to capture complementary static form information from a single frame and dynamic motion information from multi-frame differences in two separate channels. In both channels we use two stacked classical subspace networks to learn hierarchical representations, which are subsequently fused for action recognition. Our model is trained and evaluated on three typical benchmarks: the KTH, UCF and Hollywood2 datasets. The experimental results show that our approach achieves performance comparable to state-of-the-art methods. In addition, feature analysis and control experiments are carried out to demonstrate the effectiveness of the proposed approach for feature extraction and thereby for action recognition.
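The dual-channel idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes plain PCA as a stand-in for the "classical subspace network", consecutive-frame differences as the simplest form of "multi-frame differences", an absolute-value nonlinearity between the two stacked layers, and concatenation as the fusion step. The function names `pca_basis`, `stacked_subspace_features` and `dual_channel_features` are hypothetical.

```python
import numpy as np


def pca_basis(X, k):
    """Return the top-k principal directions of X (one sample per row)."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:k].T  # shape (d, k)


def stacked_subspace_features(X, k1, k2):
    """Two stacked PCA layers (assumed stand-in for the subspace networks)."""
    W1 = pca_basis(X, k1)
    # Assumed nonlinearity between layers: absolute value of the projection.
    H1 = np.abs((X - X.mean(axis=0)) @ W1)
    W2 = pca_basis(H1, k2)
    return (H1 - H1.mean(axis=0)) @ W2  # shape (n, k2)


def dual_channel_features(video, k1=16, k2=8):
    """video: array of shape (T, H, W), grayscale frames.

    Spatial channel: single frames (static form).
    Temporal channel: consecutive-frame differences (motion), an assumption.
    """
    T = video.shape[0]
    frames = video[1:].reshape(T - 1, -1)
    diffs = (video[1:] - video[:-1]).reshape(T - 1, -1)
    spatial = stacked_subspace_features(frames, k1, k2)
    temporal = stacked_subspace_features(diffs, k1, k2)
    # Assumed fusion: concatenate the two channels per time step.
    return np.concatenate([spatial, temporal], axis=1)  # shape (T-1, 2*k2)
```

In such a sketch the fused per-frame features would then be passed to any standard classifier; the choice of classifier, patch sampling and pooling is left open here because the abstract does not specify them.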
Pages: 16053-16068
Page count: 16