Where Should Saliency Models Look Next?

Cited by: 88
Authors
Bylinskii, Zoya [1 ]
Recasens, Adria [1 ]
Borji, Ali [2 ]
Oliva, Aude [1 ]
Torralba, Antonio [1 ]
Durand, Fredo [1 ]
Affiliations
[1] MIT, Comp Sci & Artificial Intelligence Lab, Cambridge, MA 02139 USA
[2] Univ Cent Florida, Ctr Comp Vis Res, Orlando, FL 32816 USA
Source
COMPUTER VISION - ECCV 2016, PT V | 2016 / Vol. 9909
Keywords
Saliency maps; Saliency estimation; Eye movements; Deep learning; Image understanding
DOI
10.1007/978-3-319-46454-1_49
Chinese Library Classification (CLC)
TP18 [Theory of Artificial Intelligence]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Recently, large breakthroughs have been observed in saliency modeling. The top scores on saliency benchmarks have become dominated by neural network models of saliency, and some evaluation scores have begun to saturate. Large jumps in performance relative to previous models can be found across datasets, image types, and evaluation metrics. Have saliency models begun to converge on human performance? In this paper, we re-examine the current state-of-the-art using a fine-grained analysis on image types, individual images, and image regions. Using experiments to gather annotations for high-density regions of human eye fixations on images in two established saliency datasets, MIT300 and CAT2000, we quantify up to 60% of the remaining errors of saliency models. We argue that to continue to approach human-level performance, saliency models will need to discover higher-level concepts in images: text, objects of gaze and action, locations of motion, and expected locations of people in images. Moreover, they will need to reason about the relative importance of image regions, such as focusing on the most important person in the room or the most informative sign on the road. More accurately tracking performance will require finer-grained evaluations and metrics. Pushing performance further will require higher-level image understanding.
Pages: 809-824
Number of pages: 16