Saliency prediction based on multi-channel models of visual processing

Cited by: 4
Authors
Li, Qiang [1, 2]
Affiliations
[1] Univ Valencia, Image Proc Lab, Valencia, Spain
[2] Triinst Ctr Translat Res Neuroimaging & Data Sci T, Atlanta, GA 30303 USA
Keywords
Visual attention; Redundancy; Multi-channel model; Opponent color channel; Wavelet energy map; Contrast sensitivity function; Saliency prediction; COLOR; INTEGRATION; ATTENTION; NETWORK;
DOI
10.1007/s00138-023-01405-2
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Visual attention is one of the most important mechanisms for selecting and understanding information from a highly redundant external world. The human visual system cannot process all incoming information simultaneously because of the visual information bottleneck. To reduce redundant visual input, it focuses mainly on the dominant parts of a scene. Predicting these regions is commonly known as visual saliency map prediction. This paper proposes a new psychophysically oriented saliency prediction architecture inspired by multi-channel models of visual cortex function in humans. The model consists of opponent color channels, a wavelet transform, a wavelet energy map, and a contrast sensitivity function, which extract low-level image features and closely approximate the low-level human visual system. The proposed model is evaluated on several datasets, including MIT1003, MIT300, TORONTO, SID4VAM, and UCF Sports. We also compare its saliency prediction performance, quantitatively and qualitatively, with that of other state-of-the-art models. Our model achieved stable and better performance across different metrics on natural images, psychophysical synthetic images, and dynamic videos. Additionally, our results suggest that Fourier and spectral-inspired saliency prediction models outperform other state-of-the-art non-neural-network models, and even deep neural network models, on psychophysical synthetic images. We also suggest that deep neural networks need specific architectures and goals to predict saliency on psychophysical synthetic images better and more reliably. Finally, the proposed model could be used as a computational model of the primate low-level vision system and help us understand its mechanisms.
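The four stages named in the abstract (opponent color channels, wavelet transform, wavelet energy map, contrast sensitivity function) can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the opponent-channel formulas, the Haar wavelet, the band-pass weight in the style of Mannos and Sakrison, the `pixels_per_degree` value, and all function names are assumptions chosen only to make the idea concrete.

```python
import numpy as np

def opponent_channels(rgb):
    """Split an RGB image (H, W, 3) into three opponent channels (assumed formulas)."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    achromatic = (r + g + b) / 3.0
    red_green = r - g
    blue_yellow = b - (r + g) / 2.0
    return achromatic, red_green, blue_yellow

def haar_level(x):
    """One level of a 2-D Haar transform: approximation plus three detail bands."""
    a, b = x[0::2, 0::2], x[0::2, 1::2]
    c, d = x[1::2, 0::2], x[1::2, 1::2]
    approx = (a + b + c + d) / 2.0
    details = ((a + b - c - d) / 2.0,   # horizontal edges
               (a - b + c - d) / 2.0,   # vertical edges
               (a - b - c + d) / 2.0)   # diagonal structure
    return approx, details

def csf_weight(freq_cpd):
    """Band-pass contrast sensitivity weight (assumed Mannos-Sakrison-style form)."""
    return 2.6 * (0.0192 + 0.114 * freq_cpd) * np.exp(-(0.114 * freq_cpd) ** 1.1)

def saliency_sketch(rgb, levels=3, pixels_per_degree=32.0):
    """Sum CSF-weighted wavelet energy over opponent channels and scales."""
    rgb = np.asarray(rgb, dtype=float)
    step = 2 ** levels
    # Crop so that every level divides evenly.
    h, w = rgb.shape[0] // step * step, rgb.shape[1] // step * step
    rgb = rgb[:h, :w]
    saliency = np.zeros((h, w))
    for channel in opponent_channels(rgb):
        approx = channel
        for level in range(1, levels + 1):
            approx, details = haar_level(approx)
            energy = sum(band ** 2 for band in details)
            # Nominal frequency of this band (its upper edge), in cycles per degree.
            freq = pixels_per_degree / 2 ** level
            # Upsample the energy map back to the cropped image size.
            block = np.ones((2 ** level, 2 ** level))
            saliency += csf_weight(freq) * np.kron(energy, block)
    saliency -= saliency.min()
    peak = saliency.max()
    return saliency / peak if peak > 0 else saliency
```

Calling `saliency_sketch(image)` on a float RGB array returns a map scaled to [0, 1] with the cropped image's height and width. Uniform regions receive zero energy, so only local contrast in luminance or color contributes to the map.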
Pages: 19