Smart voice recognition based on deep learning for depression diagnosis

Cited by: 0

Authors:

Sukit Suparatpinyo

Nuanwan Soonthornphisaj

Affiliations:

[1] Kasetsart University, Department of Computer Science, Faculty of Science

Source:

Artificial Life and Robotics | 2023 / Vol. 28, No. 2

Keywords:

Deep residual network; Spectrograph; Depression; Recognition; Audio file;

DOI:

Not available

Abstract:

Depressive disorder is a mental illness with a high incidence rate, driven by environmental stress and social pressure. Depression affects mood and behavior, leading to problems in education, family life, and the workplace; severe cases may also involve suicide attempts. However, depression is treatable once it is diagnosed by a psychiatrist. In Thailand, many people who are aware of mental disorders do not seek help from psychiatric hospitals because of long waiting times and high fees. We therefore aim to create an application that lets users perform a self-assessment from their voice signal data. In our experiment, the positive class consists of voice data recorded from depressive patients during therapy sessions in a psychiatric hospital, and the negative class consists of voice data from non-depressive people, recorded during interview sessions with university students. Each audio file is rendered as a spectrograph, a visual representation of the power spectrum. The power spectrum here consists of Mel-frequency cepstral coefficients (MFCCs) extracted from the time-varying human voice using the fast Fourier transform (FFT) and discrete cosine transform (DCT) algorithms. Because some research claims that the DCT causes some spectral features to be lost, we empirically compare spectrograph sets produced with and without the DCT. Moreover, some studies state that a larger window provides more detail of speech activity in the power spectrum, which affects depression-detection performance, so we use the Blackman-Harris and Blackman window functions to create different sets of spectrographs and test that idea on a Thai speech dataset. Deep learning models based on the deep residual network (ResNet) are explored to assess their potential for classification, and networks with different numbers of convolution layers (ResNet-34, ResNet-50, and ResNet-101) are examined.
The experimental results show that the ResNet-50 models trained on both types of spectrograph achieve an F1-score above 70%, the best performance among the approaches tested. The model trained on spectrographs extracted with the Blackman window function and without the DCT provides the best sensitivity, at 74.45%. To the best of our knowledge, our approach gives the highest F1-score when compared with state-of-the-art methods.
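
The spectrograph comparison described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, frame length, hop size, and the synthetic test tone are all assumptions made for illustration, and the Mel filterbank step is omitted for brevity, so the DCT branch yields plain cepstral-like coefficients rather than true MFCCs. The sketch only shows how the same framed signal can be turned into a log-power spectrogram with either a Blackman or a Blackman-Harris window, with or without a final DCT.

```python
import cmath
import math


def blackman(n_points):
    """Classic Blackman window (coefficients 0.42, 0.5, 0.08)."""
    m = n_points - 1
    return [0.42 - 0.5 * math.cos(2 * math.pi * n / m)
            + 0.08 * math.cos(4 * math.pi * n / m) for n in range(n_points)]


def blackman_harris(n_points):
    """Four-term Blackman-Harris window."""
    a = (0.35875, 0.48829, 0.14128, 0.01168)
    m = n_points - 1
    return [a[0] - a[1] * math.cos(2 * math.pi * n / m)
            + a[2] * math.cos(4 * math.pi * n / m)
            - a[3] * math.cos(6 * math.pi * n / m) for n in range(n_points)]


def power_spectrum(frame):
    """Naive DFT power spectrum over the non-negative frequency bins."""
    size = len(frame)
    bins = []
    for k in range(size // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * n / size)
                  for n, x in enumerate(frame))
        bins.append(abs(acc) ** 2 / size)
    return bins


def dct_ii(values):
    """Unnormalised DCT-II of a list of numbers."""
    size = len(values)
    return [sum(v * math.cos(math.pi * (n + 0.5) * k / size)
                for n, v in enumerate(values)) for k in range(size)]


def spectrogram(signal, window_fn, frame_len=64, hop=32, use_dct=False):
    """Log-power spectrogram, one list per frame; optional DCT per frame."""
    window = window_fn(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [s * w for s, w in zip(signal[start:start + frame_len], window)]
        log_power = [math.log(p + 1e-10) for p in power_spectrum(frame)]
        frames.append(dct_ii(log_power) if use_dct else log_power)
    return frames


# Hypothetical input (not data from the paper): a 1 kHz tone sampled at 8 kHz.
SAMPLE_RATE = 8000
tone = [math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE) for t in range(512)]
no_dct = spectrogram(tone, blackman)                        # non-DCT variant
with_dct = spectrogram(tone, blackman_harris, use_dct=True)  # DCT variant
```

Each returned frame is one column of the image that would be fed to a classifier; in practice a library FFT and a Mel filterbank would replace the naive DFT used here.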

Pages: 332-342

Number of pages: 10
