Atypical Salient Regions Enhancement Network for visual saliency prediction of individuals with Autism Spectrum Disorder

Cited by: 2
Authors
Duan, Huizhan [1 ]
Liu, Zhi [1 ]
Wei, Weijie [2 ]
Zhang, Tianhong [3 ]
Wang, Jijun [3 ]
Xu, Lihua [3 ]
Liu, Haichun [4 ]
Chen, Tao [5 ,6 ]
Affiliations
[1] Shanghai Univ, Shanghai Inst Adv Commun & Data Sci, Sch Commun & Informat Engn, Shanghai 200444, Peoples R China
[2] Univ Amsterdam, Atlas Lab, NL-1098 XH Amsterdam, Netherlands
[3] Shanghai Jiao Tong Univ, Shanghai Mental Hlth Ctr, Sch Med, Shanghai Key Lab Psychot Disorders, Shanghai 200030, Peoples R China
[4] Shanghai Jiao Tong Univ, Dept Automat, Shanghai 200240, Peoples R China
[5] Univ Waterloo, Big Data Res Lab, Waterloo, ON, Canada
[6] Niacin Shanghai Technol Co Ltd, Shanghai, Peoples R China
Keywords
Atypical visual saliency; Saliency prediction; Fixation prediction; Autism spectrum disorder; Deep neural network; Encoder-decoder network; Attention; Model
DOI
10.1016/j.image.2023.116968
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline Codes
0808; 0809
Abstract
Autism Spectrum Disorder (ASD) is a complex neurodevelopmental disorder that can lead to atypical visual perception of stimuli. Existing atypical saliency prediction models for individuals with ASD are largely adapted from typical saliency prediction models and neglect the distinctive visual traits and preferences that set individuals with ASD apart from neurotypical individuals. In this paper, we propose an Atypical Salient Regions Enhancement Network (ASD-ASRENet) built on an encoder-decoder architecture to predict the atypical visual saliency of individuals with ASD. Concretely, the output of the encoder is treated as an initial prediction; four Atypical Salient Regions Enhancement (ASRE) modules, specially designed for individuals with ASD, are then deployed at different stages of the decoder to emphasize atypical salient regions and progressively refine the prediction. In addition, because high-level semantic information is gradually diluted during top-down transmission in the decoder while the influence of noise in low-level features grows stronger, we design a Global Semantics Flow (GSF) module that captures global interdependencies from both the spatial and the channel perspective to guide each integration stage of the decoder. Extensive experiments demonstrate the effectiveness of the proposed modules, and ASD-ASRENet outperforms all state-of-the-art models on the Saliency4ASD benchmark.
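To make the described architecture concrete, below is a minimal PyTorch sketch of the two ideas in the abstract. All module and parameter names (GlobalSemanticsFlow, ASREStage, in_ch, prev_pred, etc.) are hypothetical illustrations rather than the authors' implementation: the GSF sketch realizes "global interdependencies from the spatial and channel perspectives" as standard spatial and channel self-attention, and the ASRE-style stage assumes the previous coarse prediction acts as a spatial gate that re-weights fused decoder features toward salient regions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class GlobalSemanticsFlow(nn.Module):
    """Illustrative GSF-style module (assumed design, not the paper's code):
    models global interdependencies of a feature map via spatial and channel
    self-attention, then adds both results back as semantic guidance."""

    def __init__(self, channels: int):
        super().__init__()
        self.query = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.key = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.value = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        # Spatial attention: every position attends to every other position.
        q = self.query(x).flatten(2).transpose(1, 2)   # (b, hw, c//8)
        k = self.key(x).flatten(2)                     # (b, c//8, hw)
        v = self.value(x).flatten(2)                   # (b, c, hw)
        attn = torch.softmax(q @ k, dim=-1)            # (b, hw, hw)
        spatial_out = (v @ attn.transpose(1, 2)).view(b, c, h, w)
        # Channel attention: channel-to-channel affinities over flattened maps.
        feat = x.flatten(2)                                        # (b, c, hw)
        chan = torch.softmax(feat @ feat.transpose(1, 2), dim=-1)  # (b, c, c)
        channel_out = (chan @ feat).view(b, c, h, w)
        return x + spatial_out + channel_out


class ASREStage(nn.Module):
    """Illustrative decoder stage with an ASRE-style enhancement step: the
    coarse saliency map from the previous stage gates the fused features,
    so the prediction is refined progressively from stage to stage."""

    def __init__(self, in_ch: int, skip_ch: int, out_ch: int):
        super().__init__()
        self.fuse = nn.Conv2d(in_ch + skip_ch, out_ch, kernel_size=3, padding=1)
        self.head = nn.Conv2d(out_ch, 1, kernel_size=1)  # per-stage saliency map

    def forward(self, x, skip, prev_pred):
        # Upsample top-down features to the skip connection's resolution, fuse.
        x = F.interpolate(x, size=skip.shape[-2:], mode="bilinear",
                          align_corners=False)
        feat = F.relu(self.fuse(torch.cat([x, skip], dim=1)))
        # Gate with the previous coarse prediction to emphasize salient regions.
        gate = torch.sigmoid(F.interpolate(prev_pred, size=feat.shape[-2:],
                                           mode="bilinear", align_corners=False))
        feat = feat * (1.0 + gate)
        return feat, self.head(feat)
```

Under these assumptions, four ASREStage instances would be chained along the decoder, each receiving the previous stage's features, the matching encoder skip connection, and the previous coarse prediction, with the GSF output injected at every integration stage as guidance.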
Pages: 11