A Spatiotemporal Heterogeneous Two-Stream Network for Action Recognition

Cited: 23
Authors
Chen, Enqing [1 ,2 ]
Bai, Xue [1 ,2 ]
Gao, Lei [3 ]
Tinega, Haron Chweya [1 ,2 ]
Ding, Yingqiang [1 ,2 ]
Affiliations
[1] Zhengzhou Univ, Sch Informat Engn, Zhengzhou 450001, Henan, Peoples R China
[2] Zhengzhou Univ, Ind Technol Res Inst, Zhengzhou 450001, Henan, Peoples R China
[3] Ryerson Univ, Dept Elect & Comp Engn, Toronto, ON M5B 2K3, Canada
Source
IEEE ACCESS | 2019, Vol. 7
Keywords
Action recognition; spatiotemporal heterogeneous; two-stream networks; ResNet; long-range temporal structure; training strategies;
DOI
10.1109/ACCESS.2019.2910604
Chinese Library Classification (CLC): TP [Automation Technology; Computer Technology]
Discipline Code: 0812
Abstract
Methods based on two-stream networks have achieved great success in video action recognition. However, most existing methods employ the same structure for both the spatial and temporal networks, leading to unsatisfactory performance. In this paper, we propose a spatiotemporal heterogeneous two-stream network, which employs two different network structures for spatial and temporal information, respectively. Specifically, the Residual Network (ResNet) and BN-Inception are utilized as the base networks to represent the spatiotemporal characteristics of different human actions. In addition, a segmental architecture is employed to model long-range temporal structure over video sequences, so as to better distinguish similar actions that share sub-actions. Moreover, combined with a data-augmentation strategy, a modified cross-modal pre-training strategy is proposed and applied to the spatiotemporal heterogeneous network to improve the final performance of human action recognition. Experiments on the UCF101 and HMDB51 datasets demonstrate that the proposed spatiotemporal heterogeneous two-stream network outperforms spatiotemporal isomorphic networks and other related methods.
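The segmental architecture and two-stream combination described in the abstract operate at the score level: each stream scores several snippets sampled across the video, a consensus function aggregates them, and the two streams are fused by weighted averaging. The sketch below illustrates this scheme in NumPy; the segment count, class count, fusion weights, and function names are illustrative assumptions, not values from the paper.

```python
import numpy as np

def segmental_consensus(segment_scores: np.ndarray) -> np.ndarray:
    """Aggregate per-segment class scores of shape (K, C) into one
    video-level score of shape (C,) by averaging, modeling long-range
    temporal structure across the K sampled segments."""
    return segment_scores.mean(axis=0)

def two_stream_fusion(spatial_scores: np.ndarray,
                      temporal_scores: np.ndarray,
                      w_spatial: float = 1.0,
                      w_temporal: float = 1.5) -> np.ndarray:
    """Weighted late fusion of the spatial stream (e.g., a ResNet on RGB
    frames) and the temporal stream (e.g., BN-Inception on optical flow).
    The weights here are illustrative, not the paper's tuned values."""
    return w_spatial * np.asarray(spatial_scores) + w_temporal * np.asarray(temporal_scores)

# Toy example: K = 3 segments, C = 4 action classes per stream.
rng = np.random.default_rng(0)
spatial = segmental_consensus(rng.random((3, 4)))
temporal = segmental_consensus(rng.random((3, 4)))
video_scores = two_stream_fusion(spatial, temporal)
predicted_class = int(np.argmax(video_scores))
```

Averaging as the consensus function keeps the aggregation differentiable, so both heterogeneous backbones can be trained end-to-end from video-level labels.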
Pages: 57267-57275
Page count: 9
Related Papers (50 total)
  • [31] Spatiotemporal two-stream LSTM network for unsupervised video summarization
    Hu, Min
    Hu, Ruimin
    Wang, Zhongyuan
    Xiong, Zixiang
    Zhong, Rui
    MULTIMEDIA TOOLS AND APPLICATIONS, 2022, 81 (28) : 40489 - 40510
  • [32] Spatial-temporal interaction learning based two-stream network for action recognition
    Liu, Tianyu
    Ma, Yujun
    Yang, Wenhan
    Ji, Wanting
    Wang, Ruili
    Jiang, Ping
    INFORMATION SCIENCES, 2022, 606 : 864 - 876
  • [33] YogNet: A two-stream network for realtime multiperson yoga action recognition and posture correction
    Yadav, Santosh Kumar
    Agarwal, Aayush
    Kumar, Ashish
    Tiwari, Kamlesh
    Pandey, Hari Mohan
    Akbar, Shaik Ali
    KNOWLEDGE-BASED SYSTEMS, 2022, 250
  • [34] Two-Stream Network for Sign Language Recognition and Translation
    Chen, Yutong
    Zuo, Ronglai
    Wei, Fangyun
    Wu, Yu
    Liu, Shujie
    Mak, Brian
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022, 2022,
  • [35] VirtualActionNet: A strong two-stream point cloud sequence network for human action recognition
    Li, Xing
    Huang, Qian
    Wang, Zhijian
    Yang, Tianjin
    JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2022, 89
  • [36] An Accurate Device-Free Action Recognition System Using Two-Stream Network
    Sheng, Biyun
    Fang, Yuanrun
    Xiao, Fu
    Sun, Lijuan
    IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2020, 69 (07) : 7930 - 7939
  • [37] Interactive two-stream graph neural network for skeleton-based action recognition
    Yang, Dun
    Zhou, Qing
    Wen, Ju
    JOURNAL OF ELECTRONIC IMAGING, 2021, 30 (03)
  • [38] Action Recognition Based on Two-Stream Convolutional Networks With Long-Short-Term Spatiotemporal Features
    Wan, Yanqin
    Yu, Zujun
    Wang, Yao
    Li, Xingxin
    IEEE ACCESS, 2020, 8 (08) : 85284 - 85293
  • [39] Human Action Recognition Combining Sequential Dynamic Images and Two-Stream Convolutional Network
    Zhang Wenqiang
    Wang Zengqiang
    Zhang Liang
    LASER & OPTOELECTRONICS PROGRESS, 2021, 58 (02)
  • [40] A Two-Stream Network For Driving Hand Gesture Recognition
    Zhou, Yefan
    Lv, Zhao
    Wang, Chaoqun
    Zhang, Shengli
    20TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS (ICDMW 2020), 2020, : 553 - 560