Estimating Autism Severity in Young Children From Speech Signals Using a Deep Neural Network

Cited by: 34
Authors
Eni, Marina [1]
Dinstein, Ilan [2, 3, 4]
Ilan, Michal [3, 4, 5]
Menashe, Idan [4, 6]
Meiri, Gal [4, 5]
Zigel, Yaniv [1]
Affiliations
[1] Ben Gurion Univ Negev, Dept Biomed Engn, IL-8410501 Beer Sheva, Israel
[2] Ben Gurion Univ Negev, Dept Brain & Cognit Sci, IL-8410501 Beer Sheva, Israel
[3] Ben Gurion Univ Negev, Dept Psychol, IL-8410501 Beer Sheva, Israel
[4] Ben Gurion Univ Negev, Natl Autism Res Ctr Israel, IL-8410501 Beer Sheva, Israel
[5] Soroka Univ, Presch Psychiat Unit, Med Ctr, IL-8457108 Beer Sheva, Israel
[6] Ben Gurion Univ Negev, Publ Hlth Dept, IL-8410501 Beer Sheva, Israel
Funding
Israel Science Foundation
Keywords
Feature extraction; Autism; Correlation; Energy states; Bandwidth; Prediction algorithms; Jitter; Audio signals; autism; autism diagnostic observation schedule; autism spectrum disorder; convolutional neural network; deep neural network; early detection; outcome measure; pitch; speech; symptom severity; treatment efficacy; zero crossing rate; SPECTRUM DISORDER; PROSODY; IDENTIFICATION; PATTERNS; FEATURES;
DOI
10.1109/ACCESS.2020.3012532
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Autism Spectrum Disorder (ASD) is a neurodevelopmental disorder that involves difficulties in social communication. Previous research has demonstrated that these difficulties are apparent in the way children with ASD speak, indicating that it may be possible to estimate ASD severity using quantitative features of speech. Here, we extracted a variety of prosodic, acoustic, and conversational features from speech recordings of Hebrew-speaking children who completed an Autism Diagnostic Observation Schedule (ADOS) assessment. Sixty features were extracted from the recordings of 72 children, and 21 of the features were significantly correlated with the children's ADOS scores. Positive correlations were found with pitch variability and Zero Crossing Rate (ZCR), while negative correlations were found with the speed and number of vocal responses to the clinician, and the overall number of vocalizations. Using these features, we built several Deep Neural Network (DNN) algorithms to estimate ADOS scores and compared their performance with Linear Regression and Support Vector Regression (SVR) models. We found that a Convolutional Neural Network (CNN) yielded the best results. This algorithm predicted ADOS scores with a mean RMSE of 4.65 and a mean correlation of 0.72 with the true ADOS scores when trained and tested on different sub-samples of the available data. Automated algorithms that can predict ASD severity in a reliable and sensitive manner have the potential to revolutionize early ASD identification, quantification of symptom severity, and assessment of treatment efficacy.
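The abstract names three quantities that are easy to state concretely: the Zero Crossing Rate of a speech frame, and the RMSE and correlation used to compare predicted and true ADOS scores. The following is a minimal sketch of their standard definitions, not the authors' implementation. The function names and the sample values are hypothetical, and Pearson correlation is assumed because the abstract does not say which correlation coefficient was used.

```python
import math


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0)
    )
    return crossings / (len(frame) - 1)


def rmse(y_true, y_pred):
    """Root mean squared error between true and predicted scores."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def pearson_r(x, y):
    """Pearson correlation coefficient (assumed metric, see lead-in)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


# Invented scores for illustration only; these are not data from the study.
true_scores = [8, 12, 15, 20, 23]
predicted_scores = [10, 11, 17, 18, 21]
print(rmse(true_scores, predicted_scores))
print(pearson_r(true_scores, predicted_scores))
```

In practice the ZCR would be computed per short frame of the recording and then summarized per child; how the study framed and aggregated the signal is not stated in the abstract.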
Pages: 139489-139500
Number of pages: 12