Speech Emotion Recognition in Neurological Disorders Using Convolutional Neural Network

Cited by: 26
Authors
Zisad, Sharif Noor [1]
Hossain, Mohammad Shahadat [1]
Andersson, Karl [2]
Affiliations
[1] Univ Chittagong, Dept Comp Sci & Engn, Chittagong, Bangladesh
[2] Lulea Univ Technol, Dept Comp Sci Elect & Space Engn, Skelleftea, Sweden
Source
BRAIN INFORMATICS, BI 2020 | 2020 / Vol. 12241
Keywords
CNN; Speech emotion; RAVDESS; MFCC; Data augmentation;
DOI
10.1007/978-3-030-59277-6_26
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Detecting emotions from speech is an emerging research field in the area of human information processing. Expressing emotion is a very difficult task for a person with a neurological disorder. Hence, a Speech Emotion Recognition (SER) system may address this by enabling barrier-free communication. Various research has been carried out in the area of SER. The main objective of this research is to develop a system that can recognize emotion from the speech of a person with a neurological disorder. Since the convolutional neural network (CNN) is an effective method, it has been used to develop the system. The system uses tonal properties such as MFCCs. The RAVDESS audio speech and song databases are used for training and testing. In addition, a custom local dataset was developed to support further training and testing. The performance of the proposed system was compared with traditional machine learning models as well as with pre-trained CNN models, including VGG16 and VGG19. The results demonstrate that the CNN model proposed in this research performed better than the mentioned machine learning techniques. This system enables one to classify eight emotions of a person with a neurological disorder: calm, angry, fearful, disgust, happy, surprise, neutral and sad.
Pages: 287-296
Page count: 10