Atypical Salient Regions Enhancement Network for visual saliency prediction of individuals with Autism Spectrum Disorder

Cited by: 2
Authors
Duan, Huizhan [1]
Liu, Zhi [1]
Wei, Weijie [2]
Zhang, Tianhong [3]
Wang, Jijun [3]
Xu, Lihua [3]
Liu, Haichun [4]
Chen, Tao [5, 6]
Affiliations
[1] Shanghai Univ, Shanghai Inst Adv Commun & Data Sci, Sch Commun & Informat Engn, Shanghai 200444, Peoples R China
[2] Univ Amsterdam, Atlas Lab, NL-1098 XH Amsterdam, Netherlands
[3] Shanghai Jiao Tong Univ, Shanghai Mental Hlth Ctr, Sch Med, Shanghai Key Lab Psychot Disorders, Shanghai 200030, Peoples R China
[4] Shanghai Jiao Tong Univ, Dept Automat, Shanghai 200240, Peoples R China
[5] Univ Waterloo, Big Data Res Lab, Waterloo, ON, Canada
[6] Niacin Shanghai Technol Co Ltd, Shanghai, Peoples R China
Keywords
Atypical visual saliency; Saliency prediction; Fixation prediction; Autism spectrum disorder; Deep neural network; ENCODER-DECODER NETWORK; ATTENTION; MODEL
DOI
10.1016/j.image.2023.116968
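Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract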
Autism Spectrum Disorder (ASD) is a complex neurodevelopmental disorder that can result in atypical visual perception of stimuli. Existing atypical saliency prediction models for individuals with ASD are heavily inspired by typical saliency prediction models and neglect the distinctive visual traits and preferences that set these individuals apart from neurotypical individuals. In this paper, we propose an Atypical Salient Regions Enhancement Network (ASD-ASRENet), based on an encoder-decoder architecture, to predict the atypical visual saliency of individuals with ASD. Concretely, the output of the encoder is treated as the initial prediction; four Atypical Salient Regions Enhancement (ASRE) modules, designed specifically for individuals with ASD, are then deployed at different stages of the decoder to emphasize the atypical salient regions and progressively complete the prediction. In addition, during top-down transmission in the decoder, the semantic information from high levels is gradually diluted while the effect of the noise contained in low levels grows stronger. To address this problem, we further design a Global Semantics Flow (GSF) module, which captures global interdependencies from both the spatial and the channel perspective, to guide each integration stage in the decoder. Extensive experiments demonstrate the effectiveness of the proposed modules, and ASD-ASRENet achieves superior performance compared with all state-of-the-art models on the Saliency4ASD benchmark.
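The abstract does not say how the GSF module computes global interdependencies. Purely as an illustration, one common way to relate every spatial position (or every channel) to all the others is softmax-normalized self-attention. The pure-Python sketch below assumes that technique; the function names (`spatial_attention`, `channel_attention`, `global_context`) and the residual sum are hypothetical choices made for this example and are not taken from the paper.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def spatial_attention(feat):
    """feat is a C x N matrix (C channels, N = H*W flattened positions).
    Each position becomes a softmax-weighted mix of all positions."""
    pos = transpose(feat)                                   # N x C
    affinity = [softmax(r) for r in matmul(pos, feat)]      # N x N
    return transpose(matmul(affinity, pos))                 # C x N

def channel_attention(feat):
    """Each channel becomes a softmax-weighted mix of all channels."""
    affinity = [softmax(r) for r in matmul(feat, transpose(feat))]  # C x C
    return matmul(affinity, feat)                           # C x N

def global_context(feat):
    """Assumed fusion: input plus both attention branches (residual sum)."""
    s = spatial_attention(feat)
    c = channel_attention(feat)
    return [[f + a + b for f, a, b in zip(fr, sr, cr)]
            for fr, sr, cr in zip(feat, s, c)]

if __name__ == "__main__":
    # Toy feature map: 2 channels, 3 flattened positions.
    toy = [[1.0, 0.0, 2.0], [0.5, 1.5, -1.0]]
    for row in global_context(toy):
        print(row)
```

In a decoder of the kind the abstract describes, an output like that of `global_context` would presumably be passed to each integration stage as guidance; how the paper actually fuses it is not stated here.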
Pages: 11