Top-Down Visual Saliency via Joint CRF and Dictionary Learning

Times cited: 103
Authors
Yang, Jimei [1]
Yang, Ming-Hsuan [2]
Affiliations
[1] Adobe Res, San Jose, CA 95110 USA
[2] Univ Calif Merced, Sch Engn, Merced, CA USA
Funding
U.S. National Science Foundation;
Keywords
Visual saliency; top-down visual saliency; fixation prediction; dictionary learning and conditional random fields; FEATURES; ATTENTION;
DOI
10.1109/TPAMI.2016.2547384
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Top-down visual saliency is an important module of visual attention. In this work, we propose a novel top-down saliency model that jointly learns a Conditional Random Field (CRF) and a visual dictionary. The proposed model incorporates a layered structure from top to bottom: CRF, sparse coding and image patches. With sparse coding as an intermediate layer, CRF is learned in a feature-adaptive manner; meanwhile with CRF as the output layer, the dictionary is learned under structured supervision. For efficient and effective joint learning, we develop a max-margin approach via a stochastic gradient descent algorithm. Experimental results on the Graz-02 and PASCAL VOC datasets show that our model performs favorably against state-of-the-art top-down saliency methods for target object localization. In addition, the dictionary update significantly improves the performance of our model. We demonstrate the merits of the proposed top-down saliency model by applying it to prioritizing object proposals for detection and predicting human fixations.
Pages: 576-588
Page count: 13