Robust time domain scalogram filter bank feature learning model for speech depression detection with metaheuristic spatio temporal residual BIGRU model

Cited by: 0
Authors
Jaishankar, Uma [1]
Nirmal, Jagannath H. [2]
Gidaye, Girish [3]
Affiliations
[1] KJ Somaiya Coll Engn, Dept Elect, Mumbai 400077, Maharashtra, India
[2] KJ Somaiya Coll Engn, Mumbai 400077, Maharashtra, India
[3] Vidyalankar Inst Technol, Mumbai 400037, Maharashtra, India
Keywords
speech depression; SD; classification; benchmark dataset; scalogram filter; performance metrics; pre-processing; NETWORK;
DOI
10.1504/IJBET.2025.145219
Chinese Library Classification number
R318 [Biomedical Engineering];
Discipline classification code
0831;
Abstract
Speech patterns have become a viable biometric for detecting depressive disorders, but existing methods struggle to model temporal dependencies and to extract reliable features from speech data. To overcome these challenges, this study develops a time-domain scalogram filter bank feature-learning model. The model incorporates nonlinear transformation, increased scalogram downsampling, and time-domain filtering to improve feature extraction. The study also proposes the convolutional spatial and temporal attention-based residual Gazelle bidirectional gated recurrent unit (BIGRU) model (CSTAResGBIGRU), which integrates spatial and temporal attention mechanisms with residual learning. The datasets used in this study are the Distress Analysis Interview Corpus/Wizard-of-Oz set (DAIC-WOZ) and the emotional audio-textual corpus (EATD-Corpus). Furthermore, multiple learning curve analyses and ablation studies are carried out to demonstrate the efficacy of the proposed model. According to the experimental results, the proposed model outperforms state-of-the-art techniques, attaining 99.31% and 99.5% accuracy on DAIC-WOZ and EATD-Corpus, respectively.
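The abstract names three ingredients of the feature front end: time-domain filtering, a nonlinear transformation, and scalogram downsampling. The following is a minimal illustrative sketch of how such a pipeline could be wired together, not the authors' implementation. The complex Morlet wavelet, the `log1p` compression, the block-averaging downsampler, and every function name, scale value, and parameter (`w0`, `factor`) are assumptions chosen for illustration; the record does not specify any of them.

```python
import cmath
import math


def morlet(t, scale, w0=6.0):
    """Complex Morlet wavelet at time offset t (in samples); an assumed filter shape."""
    x = t / scale
    return cmath.exp(1j * w0 * x) * math.exp(-0.5 * x * x) / math.sqrt(scale)


def scalogram(signal, scales, w0=6.0):
    """Magnitudes of a direct time-domain wavelet filter bank, one row per scale."""
    n = len(signal)
    rows = []
    for s in scales:
        half = int(4 * s)  # truncate the wavelet at +/- 4 scale widths
        kernel = [morlet(k, s, w0).conjugate() for k in range(-half, half + 1)]
        row = []
        for i in range(n):
            acc = 0j
            for j, w in enumerate(kernel):
                idx = i + j - half
                if 0 <= idx < n:  # zero padding at the signal edges
                    acc += signal[idx] * w
            row.append(abs(acc))
        rows.append(row)
    return rows


def compress_and_downsample(rows, factor):
    """Log-compress each coefficient, then average over blocks of `factor` frames."""
    out = []
    for row in rows:
        logged = [math.log1p(v) for v in row]
        out.append([
            sum(logged[i:i + factor]) / factor
            for i in range(0, len(logged) - factor + 1, factor)
        ])
    return out


# Toy input (hypothetical): a 256-sample sinusoid with a period of 32 samples.
signal = [math.sin(2 * math.pi * n / 32) for n in range(256)]
scales = [8.0, 16.0, 30.0, 45.0]
features = compress_and_downsample(scalogram(signal, scales), factor=8)
```

For this toy input, the centre frequency of the scale-30 wavelet (`w0 / 30` radians per sample) lies closest to the sinusoid's frequency (`2 * pi / 32`), so that band gives the strongest response away from the signal edges. A real speech front end would typically use an optimized library rather than this quadratic-time loop.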
Pages: 348 / 382
Page count: 36
Related papers
29 in total
[1]   Gazelle optimization algorithm: a novel nature-inspired metaheuristic optimizer [J].
Agushaka, Jeffrey O. ;
Ezugwu, Absalom E. ;
Abualigah, Laith .
NEURAL COMPUTING & APPLICATIONS, 2023, 35 (05) :4099-4131
[2]   Taking All the Factors We Need: A Multimodal Depression Classification With Uncertainty Approximation [J].
Ahmed, Sabbir ;
Abu Yousuf, Mohammad ;
Monowar, Muhammad Mostafa ;
Hamid, Abdul ;
Alassafi, Madini O. .
IEEE ACCESS, 2023, 11 :99847-99861
[3]   Machine Learning Algorithms for Depression: Diagnosis, Insights, and Research Directions [J].
Aleem, Shumaila ;
ul Huda, Noor ;
Amin, Rashid ;
Khalid, Samina ;
Alshamrani, Sultan S. ;
Alshehri, Abdullah .
ELECTRONICS, 2022, 11 (07)
[4]   A Multimodal Framework for Depression Detection During COVID-19 via Harvesting Social Media [J].
Anshul, Ashutosh ;
Pranav, Gumpili Sai ;
Rehman, Mohammad Zia Ur ;
Kumar, Nagendra .
IEEE TRANSACTIONS ON COMPUTATIONAL SOCIAL SYSTEMS, 2024, 11 (02) :2872-2888
[5]   Automated speech-based screening of depression using deep convolutional neural networks [J].
Chlasta, Karol ;
Wolk, Krzysztof ;
Krejtz, Izabela .
CENTERIS2019--INTERNATIONAL CONFERENCE ON ENTERPRISE INFORMATION SYSTEMS/PROJMAN2019--INTERNATIONAL CONFERENCE ON PROJECT MANAGEMENT/HCIST2019--INTERNATIONAL CONFERENCE ON HEALTH AND SOCIAL CARE INFORMATION SYSTEMS AND TECHNOLOGIES, 2019, 164 :618-628
[6]   A deep learning model for depression detection based on MFCC and CNN generated spectrogram features [J].
Das, Arnab Kumar ;
Naskar, Ruchira .
BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2024, 90
[7]   Spatial-Temporal Feature Network for Speech-Based Depression Recognition [J].
Han, Zhuojin ;
Shang, Yuanyuan ;
Shao, Zhuhong ;
Liu, Jingyi ;
Guo, Guodong ;
Liu, Tie ;
Ding, Hui ;
Hu, Qiang .
IEEE TRANSACTIONS ON COGNITIVE AND DEVELOPMENTAL SYSTEMS, 2024, 16 (01) :308-318
[8]   Huang ZC, 2020, INT CONF ACOUST SPEE, P6549, DOI 10.1109/ICASSP40776.2020.9054323
[9]   Diagnosis of Depression Based on Four-Stream Model of Bi-LSTM and CNN From Audio and Text Information [J].
Jo, A-Hyeon ;
Kwak, Keun-Chang .
IEEE ACCESS, 2022, 10 :134113-134135
[10]   Speech as a Biomarker for Depression [J].
Koops, Sanne ;
Brederoo, Sanne G. ;
de Boer, Janna N. ;
Nadema, Femke G. ;
Voppel, Alban E. ;
Sommer, Iris E. .
CNS & NEUROLOGICAL DISORDERS-DRUG TARGETS, 2023, 22 (02) :152-160