Neural networks based visual attention model for surveillance videos

Cited by: 9

Authors:

Guraya, Fahad Fazal Elahi [1]

Cheikh, Faouzi Alaya [1]

Affiliations:

[1] Gjovik Univ Coll, Fac Comp Sci & Media Technol, N-2802 Gjovik, Norway

Source:

NEUROCOMPUTING | 2015 / Vol. 149

Keywords:

Visual salience; Video; Surveillance; Neural network; Attention model; HVS; EYE-MOVEMENTS; SALIENCY; SEARCH; TASK;

DOI:

10.1016/j.neucom.2014.08.062

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this paper we propose a novel Computational Attention Model (CAM) that fuses bottom-up, top-down and salient motion visual cues to compute visual salience in surveillance videos. When a system handles several visual features or cues, combining them is challenging, because there is no commonly agreed natural way of combining the conspicuity maps obtained from different features, for example face and motion. The challenge is thus to find the mix of visual cues that yields the salience map closest to the corresponding gaze map. Many CAMs in the literature use fixed weights to combine the different visual cues. This is computationally attractive but crude, and the weights are typically set in an ad hoc fashion. We therefore propose a machine learning approach that uses an Artificial Neural Network (ANN) to estimate these weights. The ANN is trained on gaze maps obtained by eye tracking in psychophysical experiments. The estimated weights are then used to combine the conspicuities of the different visual cues in our CAM, which is then applied to surveillance videos. The proposed model is designed to consider the important visual cues typically present in surveillance videos and to combine their conspicuities via the ANN. The obtained results are encouraging and show a clear improvement over state-of-the-art CAMs. (c) 2014 Elsevier B.V. All rights reserved.
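The contrast the abstract draws between fixed fusion weights and weights learned from gaze maps can be illustrated with a minimal sketch. This is not the authors' network: the abstract does not describe the ANN architecture, so a single sigmoid neuron trained by gradient descent is assumed here, and the function names `fuse` and `learn_fusion_weights` are hypothetical.

```python
import numpy as np

def fuse(maps, weights, bias=0.0):
    """Weighted sum of conspicuity maps; maps has shape (n_cues, H, W)."""
    return np.tensordot(weights, maps, axes=1) + bias

def learn_fusion_weights(maps, gaze, epochs=500, lr=0.5):
    """Fit one weight per cue with a single sigmoid neuron (an assumption,
    not the architecture used in the paper).

    maps: (n_cues, H, W) conspicuity maps scaled to [0, 1]
    gaze: (H, W) gaze map scaled to [0, 1], used as the training target
    """
    n_cues = maps.shape[0]
    X = maps.reshape(n_cues, -1).T   # one row per pixel, one column per cue
    y = gaze.ravel()
    w = np.zeros(n_cues)
    b = 0.0
    for _ in range(epochs):
        pred = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        grad_z = (pred - y) / y.size  # gradient of mean cross-entropy w.r.t. pre-activation
        w -= lr * (X.T @ grad_z)
        b -= lr * grad_z.sum()
    return w, b

# Synthetic stand-in data: three random "cue" maps, with a gaze map that
# depends only on the second cue. Real inputs would come from feature
# extraction and eye tracking.
rng = np.random.default_rng(0)
maps = rng.random((3, 32, 32))
gaze = (maps[1] > 0.5).astype(float)

w, b = learn_fusion_weights(maps, gaze)
salience = 1.0 / (1.0 + np.exp(-fuse(maps, w, b)))
```

On this synthetic input the learned weight for the second cue should dominate, whereas a fixed equal-weight fusion would treat all three cues alike regardless of how well each predicts gaze.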


Pages: 1348-1359

Page count: 12

