Decision-Theoretic Saliency: Computational Principles, Biological Plausibility, and Implications for Neurophysiology and Psychophysics

Cited by: 67
Authors
Gao, Dashan [1]
Vasconcelos, Nuno [1]
Affiliations
[1] Univ Calif San Diego, Stat Visual Comp Lab, La Jolla, CA 92093 USA
Funding
National Science Foundation (USA)
Keywords
PREATTENTIVE TEXTURE-DISCRIMINATION; CLASSICAL RECEPTIVE-FIELD; VISUAL-SEARCH; SIMPLE CELLS; ATTENTION; CONTRAST; FEATURES; STATISTICS; RESPONSES; MODEL;
DOI
10.1162/neco.2009.11-06-391
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A decision-theoretic formulation of visual saliency, first proposed for top-down processing (object recognition) (Gao & Vasconcelos, 2005a), is extended to the problem of bottom-up saliency. Under this formulation, optimality is defined in the minimum probability of error sense, under a constraint of computational parsimony. The saliency of the visual features at a given location of the visual field is defined as the power of those features to discriminate between the stimulus at the location and a null hypothesis. For bottom-up saliency, the null hypothesis is the set of visual features that surround the location under consideration. Discrimination is defined in an information-theoretic sense, and the optimal saliency detector is derived for a class of stimuli that complies with known statistical properties of natural images. It is shown that, under the assumption that saliency is driven by linear filtering, the optimal detector consists of what is usually referred to as the standard architecture of V1: a cascade of linear filtering, divisive normalization, rectification, and spatial pooling. The optimal detector is also shown to replicate the fundamental properties of the psychophysics of saliency: stimulus pop-out, saliency asymmetries for stimulus presence versus absence, disregard of feature conjunctions, and Weber's law. Finally, it is shown that the optimal saliency architecture can be applied to the solution of generic inference problems. In particular, for the class of stimuli studied, it performs the three fundamental operations of statistical inference: assessment of probabilities, implementation of the Bayes decision rule, and feature selection.
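The four-stage cascade named in the abstract (linear filtering, divisive normalization, rectification, spatial pooling) can be illustrated with a toy one-dimensional sketch. This is not the detector derived in the paper. The difference kernel, the neighbourhood radii, the constant `sigma`, the choice of full-wave rectification, and all function names are assumptions made only for illustration.

```python
def linear_filter(signal, kernel):
    """Valid-mode correlation of a 1-D signal with a kernel."""
    k = len(kernel)
    return [sum(kernel[j] * signal[i + j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def divisive_normalization(responses, radius, sigma=1.0):
    """Divide each response by sigma plus the mean absolute response nearby."""
    n = len(responses)
    out = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        pool = sum(abs(r) for r in responses[lo:hi]) / (hi - lo)
        out.append(responses[i] / (sigma + pool))
    return out


def rectify(responses):
    """Full-wave rectification (an assumed choice of nonlinearity)."""
    return [abs(r) for r in responses]


def spatial_pool(responses, radius):
    """Average each response with its neighbours within the given radius."""
    n = len(responses)
    out = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        out.append(sum(responses[lo:hi]) / (hi - lo))
    return out


def saliency_map(signal, kernel=(-1.0, 1.0), norm_radius=3, pool_radius=1):
    """Apply the four stages in the order listed in the abstract."""
    filtered = linear_filter(signal, kernel)
    normalized = divisive_normalization(filtered, norm_radius)
    rectified = rectify(normalized)
    return spatial_pool(rectified, pool_radius)


# Hypothetical stimulus: a flat background with one deviating sample.
stimulus = [1.0] * 20
stimulus[10] = 3.0
saliency = saliency_map(stimulus)
peak = max(range(len(saliency)), key=lambda i: saliency[i])
print("most salient filter position:", peak)
```

In this toy stimulus the output is zero over the uniform background and peaks next to the deviating sample, which is a crude analogue of the pop-out behaviour the abstract mentions.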
Pages: 239-271
Number of pages: 33
Related papers
87 entries in total
[1] Adelson, EH; Bergen, JR. Spatiotemporal energy models for the perception of motion [J]. Journal of the Optical Society of America A-Optics Image Science and Vision, 1985, 2 (2): 284-299
[2] Agarwal S, 2002, LECT NOTES COMPUT SC, V2353, P113
[3] Allman, J; Miezin, F; McGuinness, E. Stimulus specific responses from beyond the classical receptive field: neurophysiological mechanisms for local-global comparisons in visual neurons [J]. Annual Review of Neuroscience, 1985, 8: 407-430
[4] [Anonymous], 1988, Proceedings of International Conference of Computer Vision (ICCV'88), DOI 10.1109/CCV.1988.590008
[5] Attneave, F. Some informational aspects of visual perception [J]. Psychological Review, 1954, 61 (3): 183-193
[6] Bar Hillel A, 2005, IEEE I CONF COMP VIS, P1762
[7] Barlow, H. Redundancy reduction revisited [J]. Network-Computation in Neural Systems, 2001, 12 (3): 241-253
[8] Barlow H.B., 1961, Sensory communication, V1, P217, DOI 10.7551/MITPRESS/9780262518420.003.0013
[9] Bell, AJ; Sejnowski, TJ. The "independent components" of natural scenes are edge filters [J]. Vision Research, 1997, 37 (23): 3327-3338
[10] Birney, KA; Fischer, TR. On the modeling of DCT and subband image data for compression [J]. IEEE Transactions on Image Processing, 1995, 4 (2): 186-193