A Spatiotemporal Heterogeneous Two-Stream Network for Action Recognition

Cited by: 23
Authors
Chen, Enqing [1, 2]
Bai, Xue [1, 2]
Gao, Lei [3]
Tinega, Haron Chweya [1, 2]
Ding, Yingqiang [1, 2]
Affiliations
[1] Zhengzhou Univ, Sch Informat Engn, Zhengzhou 450001, Henan, Peoples R China
[2] Zhengzhou Univ, Ind Technol Res Inst, Zhengzhou 450001, Henan, Peoples R China
[3] Ryerson Univ, Dept Elect & Comp Engn, Toronto, ON M5B 2K3, Canada
Source
IEEE Access, 2019, Vol. 7
Keywords
Action recognition; spatiotemporal heterogeneous; two-stream networks; ResNet; long-range temporal structure; training strategies
DOI
10.1109/ACCESS.2019.2910604
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Methods based on two-stream networks have achieved great success in video action recognition. However, most existing methods employ the same structure for both the spatial and temporal networks, leading to unsatisfactory performance. In this paper, we propose a spatiotemporal heterogeneous two-stream network, which employs two different network structures for spatial and temporal information, respectively. Specifically, the Residual Network (ResNet) and BN-Inception are utilized as the base networks to represent the spatiotemporal characteristics of different human actions. In addition, a segmental architecture is employed to model long-range temporal structure over video sequences, to better distinguish similar actions that share sub-actions. Moreover, combined with a data augmentation strategy, a modified cross-modal pre-training strategy is proposed and applied to the spatiotemporal heterogeneous network to improve the final performance of human action recognition. Experiments on the UCF101 and HMDB51 datasets demonstrate that the proposed spatiotemporal heterogeneous two-stream network outperforms spatiotemporal isomorphic networks and other related methods.
Pages: 57267-57275
Page count: 9
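
The segmental architecture and two-stream design mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the sampling rule, the consensus function, or the fusion weights, so everything below is an assumption for illustration: the function names are hypothetical, one center frame is taken per segment, snippet scores are combined by simple averaging, and the two streams are merged by weighted late fusion. The per-snippet class scores would come from the two base networks (ResNet and BN-Inception), which are not modeled here.

```python
from typing import List, Sequence


def sample_segment_indices(num_frames: int, num_segments: int) -> List[int]:
    """Split a video into equal-length segments and pick each segment's center frame.

    Assumption: center-frame sampling; the paper may sample differently.
    """
    if num_frames < 1 or num_segments < 1:
        raise ValueError("num_frames and num_segments must be positive")
    seg_len = num_frames / num_segments
    return [min(num_frames - 1, int(seg_len * (k + 0.5))) for k in range(num_segments)]


def segmental_consensus(snippet_scores: Sequence[Sequence[float]]) -> List[float]:
    """Average per-snippet class scores into one video-level score vector.

    Assumption: averaging is used as the consensus function.
    """
    if not snippet_scores:
        raise ValueError("need at least one snippet")
    n = len(snippet_scores)
    return [sum(column) / n for column in zip(*snippet_scores)]


def fuse_streams(
    spatial_scores: Sequence[float],
    temporal_scores: Sequence[float],
    temporal_weight: float = 1.5,
) -> List[float]:
    """Weighted late fusion of the two streams' video-level scores.

    Assumption: the weight 1.5 is a placeholder, not a value from the paper.
    """
    return [s + temporal_weight * t for s, t in zip(spatial_scores, temporal_scores)]


def predict(fused_scores: Sequence[float]) -> int:
    """Return the index of the highest-scoring class."""
    return max(range(len(fused_scores)), key=lambda i: fused_scores[i])
```

In use, each stream would score the snippets chosen by `sample_segment_indices`, reduce them with `segmental_consensus`, and the two resulting vectors would be passed to `fuse_streams` before calling `predict`. Spreading snippets over the whole video is what lets a model of this kind tell apart actions that look alike over a short window.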