DTA: Double LSTM with Temporal-wise Attention Network for Action Recognition

Cited by: 0

Authors:

Xu, Yangyang [1,2]

Wang, Lei [2,3]

Cheng, Jun [2,3]

Xia, Haiying [1]

Yin, Jianqin [4]

Affiliations:

[1] Guangxi Normal Univ, Guilin, Peoples R China

[2] Chinese Acad Sci, Shenzhen Inst Adv Technol, Shenzhen Key Lab Virtual Real & Human Interact Te, Shenzhen, Peoples R China

[3] Chinese Univ Hong Kong, Hong Kong, Hong Kong, Peoples R China

[4] Beijing Univ Posts & Telecommun, Sch Automat, Beijing, Peoples R China

Source:

PROCEEDINGS OF 2017 3RD IEEE INTERNATIONAL CONFERENCE ON COMPUTER AND COMMUNICATIONS (ICCC) | 2017

Funding:

National Natural Science Foundation of China;

Keywords:

Action Recognition; CNN; LSTM; Attention Model;

DOI:

Not available

Chinese Library Classification:

TP301 [Theory, Methods];

Discipline classification code:

081202;

Abstract:

In this paper, we propose a new architecture for human action recognition that uses a convolutional neural network (CNN) and two Long Short-Term Memory (LSTM) networks with a temporal-wise attention model. We call this network the Double LSTM with Temporal-wise Attention network (DTA). The features extracted by our model are both spatial and temporal. The attention model can learn which parts of which frames in a video are relevant to the video label and pay more attention to them. We designed a joint optimization layer (JOL) to jointly process the two kinds of features produced by the two LSTMs. The proposed network achieved improved performance on three widely used datasets: the UCF Sports dataset, the UCF11 dataset and the HMDB51 dataset.
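The idea of weighting frames by relevance can be illustrated with a minimal sketch of softmax attention over time. This is not the authors' implementation: the record gives no equations, so the function name temporal_attention_pool, the linear scoring vector, and the feature sizes are all assumptions made for illustration only.

```python
import numpy as np

def temporal_attention_pool(frame_features, score_vector):
    """Pool per-frame features into one vector using softmax weights over time.

    frame_features: array of shape (T, D), e.g. hypothetical per-frame LSTM outputs.
    score_vector:   array of shape (D,), a hypothetical learned scoring vector.
    """
    scores = frame_features @ score_vector            # one relevance score per frame, shape (T,)
    scores = scores - scores.max()                    # shift for numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()   # softmax over time, sums to 1
    pooled = weights @ frame_features                 # attention-weighted sum, shape (D,)
    return pooled, weights

# Illustrative data: 8 frames with 4-dimensional features (sizes are assumptions).
rng = np.random.default_rng(0)
frame_features = rng.normal(size=(8, 4))
score_vector = rng.normal(size=4)
pooled, weights = temporal_attention_pool(frame_features, score_vector)
```

Frames with higher scores receive larger weights, so they contribute more to the pooled video-level feature. With an all-zero scoring vector the weights become uniform and the result reduces to plain average pooling over frames.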

Citation

Pages: 1676-1680

Page count: 5

