A Spatiotemporal Heterogeneous Two-Stream Network for Action Recognition

Cited by: 23

Authors:

Chen, Enqing [1,2]

Bai, Xue [1,2]

Gao, Lei [3]

Tinega, Haron Chweya [1,2]

Ding, Yingqiang [1,2]

Affiliations:

[1] Zhengzhou Univ, Sch Informat Engn, Zhengzhou 450001, Henan, Peoples R China

[2] Zhengzhou Univ, Ind Technol Res Inst, Zhengzhou 450001, Henan, Peoples R China

[3] Ryerson Univ, Dept Elect & Comp Engn, Toronto, ON M5B 2K3, Canada

Source:

IEEE ACCESS | 2019 / Vol. 7

Keywords:

Action recognition; spatiotemporal heterogeneous; two-stream networks; ResNet; long-range temporal structure; training strategies;

DOI:

10.1109/ACCESS.2019.2910604

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

Methods based on two-stream networks have achieved great success in video action recognition. However, most existing methods employ the same structure for both the spatial and temporal networks, leading to unsatisfactory performance. In this paper, we propose a spatiotemporal heterogeneous two-stream network, which employs two different network structures for spatial and temporal information, respectively. Specifically, the Residual Network (ResNet) and BN-Inception are utilized as the base networks to represent the spatiotemporal characteristics of different human actions. In addition, a segmental architecture is employed to model long-range temporal structure over video sequences, to better distinguish similar actions that share sub-actions. Moreover, combined with data augmentation, a modified cross-modal pre-training strategy is proposed and applied to the spatiotemporal heterogeneous network to improve the final performance of human action recognition. Experiments on the UCF101 and HMDB51 datasets demonstrate that the proposed spatiotemporal heterogeneous two-stream network outperforms spatiotemporal isomorphic networks and other related methods.
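The segmental modeling and two-stream fusion mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the number of segments, the consensus function, or the fusion weights, so everything below is an assumption: the function names, the choice of the centre frame per segment, average consensus, and the `temporal_weight` of 1.5 are hypothetical and are not taken from the paper. The per-snippet class scores would come from the two base networks (ResNet and BN-Inception in the paper), which are not modeled here.

```python
import math
from typing import List, Sequence


def sample_segment_indices(num_frames: int, num_segments: int) -> List[int]:
    """Split a video into equal segments and pick the centre frame of each.

    Assumption: centre-frame sampling; the paper may sample differently.
    """
    if num_frames <= 0 or num_segments <= 0:
        raise ValueError("num_frames and num_segments must be positive")
    seg_len = num_frames / num_segments
    return [min(num_frames - 1, int(seg_len * (k + 0.5)))
            for k in range(num_segments)]


def average_consensus(snippet_scores: Sequence[Sequence[float]]) -> List[float]:
    """Average per-class scores over all sampled snippets (assumed consensus)."""
    n = len(snippet_scores)
    return [sum(col) / n for col in zip(*snippet_scores)]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one score vector."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_streams(spatial_scores: Sequence[float],
                 temporal_scores: Sequence[float],
                 temporal_weight: float = 1.5) -> int:
    """Late fusion: weighted sum of per-stream softmax scores.

    Returns the index of the predicted class. The weight is a
    hypothetical value chosen for illustration only.
    """
    sp = softmax(spatial_scores)
    tp = softmax(temporal_scores)
    fused = [s + temporal_weight * t for s, t in zip(sp, tp)]
    return max(range(len(fused)), key=fused.__getitem__)
```

In this sketch each stream would first be reduced to a video-level score vector with `average_consensus`, and the two vectors would then be combined by `fuse_streams`; because the streams are fused only at the score level, the spatial and temporal networks are free to use different architectures.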


Pages: 57267-57275

Number of pages: 9
