Hierarchical Attention Network for Action Segmentation

Cited by: 4

Authors:

Gammulle, Harshala [1]

Denman, Simon [1]

Sridharan, Sridha [1]

Fookes, Clinton [1]

Affiliations:

[1] Queensland Univ Technol, SAIVT, Image & Video Res Lab, Brisbane, Qld, Australia

Source:

PATTERN RECOGNITION LETTERS | 2020 / Vol. 131

Keywords:

Cameras;

DOI:

10.1016/j.patrec.2020.01.023

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Temporal segmentation of events is an essential task and a precursor to the automatic recognition of human actions in video. Several attempts have been made to capture frame-level salient aspects through attention, but they lack the capacity to effectively map the temporal relationships between frames, as they capture only a limited span of temporal dependencies. To this end, we propose a complete end-to-end supervised learning approach that can better learn relationships between actions over time, thus improving overall segmentation performance. The proposed hierarchical recurrent attention framework analyses the input video at multiple temporal scales to form embeddings at the frame level and segment level, and performs fine-grained action segmentation. This yields a simple, lightweight, yet extremely effective architecture for segmenting continuous video streams, with multiple application domains. We evaluate our system on multiple challenging public benchmark datasets, including the MERL Shopping, 50 Salads, and Georgia Tech Egocentric datasets, and achieve state-of-the-art performance. The evaluated datasets encompass numerous video capture settings, including static overhead camera views and dynamic, egocentric head-mounted camera views, demonstrating the direct applicability of the proposed framework in a variety of settings. (c) 2020 Elsevier B.V. All rights reserved.

Citation

Pages: 442-448

Page count: 7

