Top-Down Saliency Detection Based on Deep-Learned Features

Cited by: 7
Authors
Zhang, Duzhen [1, 2]
Zakir, Ali [2]
Affiliations
[1] Jiangsu Normal Univ, Sch Comp Sci & Technol, Xuzhou 221116, Jiangsu, Peoples R China
[2] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Nanjing 210094, Jiangsu, Peoples R China
Keywords
Saliency detection; top-down; deep learning; convolutional neural networks; visual saliency; object tracking; scenes; model
DOI
10.1142/S1469026819500093
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Localizing objects in images accurately and efficiently is a challenging problem in computer vision. This paper proposes a novel top-down, fine-grained salient object detection method based on deep-learned features, which detects the object in the input image that matches the object in the query image. The query image and its three subsampled images serve as top-down cues to guide saliency detection. The convolutional neural network (CNN) is improved by using the fast VGG network (VGG-f), pre-trained on ImageNet and re-trained on the Pascal VOC 2012 dataset. Experiments on the FiFA dataset demonstrate that the proposed method can localize the salient region and find the specific object (e.g., a human face) given as the query. Experiments on the David1 and Face1 sequences show that the proposed algorithm effectively handles many challenging factors, including illumination change, shape deformation, scale change, and partial occlusion.
Pages: 12