Improvement on Speech Depression Recognition Based on Deep Networks

Cited by: 0
Authors
Li, Jinming [1]
Fu, Xiaoyan [2]
Shao, Zhuhong [3]
Shang, Yuanyuan [4]
Affiliations
[1] Capital Normal Univ, Coll Informat Engn, Beijing, Peoples R China
[2] Capital Normal Univ, Beijing Key Lab Elect Syst Reliabil Technol, Beijing, Peoples R China
[3] Capital Normal Univ, Beijing Adv Innovat Ctr Imaging Technol, Beijing, Peoples R China
[4] Capital Normal Univ, Beijing Engn Res Ctr High Reliable Embedded Syst, Beijing, Peoples R China
Source
2018 CHINESE AUTOMATION CONGRESS (CAC) | 2018
Funding
National Natural Science Foundation of China;
Keywords
automated depression diagnosis; speech processing; deep learning; feature extraction;
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
To reduce the burden on clinicians of diagnosing large numbers of cases with depressive symptoms, artificial intelligence researchers are increasingly interested in designing automatic depression recognition systems. The speech signals of depressed patients differ from those of healthy people. Here, we present a deep model, Depression AudioNet, which encodes depression-related characteristics of the vocal tract and provides a more comprehensive audio representation. First, Mel-frequency cepstral coefficients (MFCCs) are extracted from the raw audio data. Second, robust emotion features are obtained with Multiscale Audio Delta Normalization (MADN), a data processing algorithm that we propose. Finally, the MFCCs and the emotion features of two adjacent local audio segments are fed in turn into Depression AudioNet to train the network. This method addresses the problems of limited training data and low precision by increasing the length information of each sample without reducing the number of samples. Experiments on the AVEC2014 dataset show that the proposed method is more effective and accurate than existing speech depression recognition algorithms.
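The abstract does not define MADN or specify how adjacent segments are paired, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that a "multiscale delta" is a frame difference taken at several time offsets, that normalization is a per-coefficient z-score, and that pairing means each segment is combined with its successor. The function names `delta`, `normalize`, `multiscale_delta_features`, and `adjacent_pairs` are hypothetical, and the input frames stand in for MFCC vectors computed elsewhere.

```python
from statistics import mean, pstdev


def delta(frames, scale):
    """Difference between the frames `scale` steps ahead and behind.

    Indices are clamped at the edges, so the output has as many frames
    as the input. (Assumed definition, not taken from the paper.)
    """
    n = len(frames)
    out = []
    for t in range(n):
        ahead = frames[min(t + scale, n - 1)]
        behind = frames[max(t - scale, 0)]
        out.append([a - b for a, b in zip(ahead, behind)])
    return out


def normalize(frames):
    """Z-score each coefficient across time; constant columns map to 0."""
    cols = list(zip(*frames))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # avoid division by zero
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in frames]


def multiscale_delta_features(frames, scales=(1, 2, 4)):
    """Concatenate normalized deltas computed at several time scales."""
    per_scale = [normalize(delta(frames, k)) for k in scales]
    return [sum((ps[t] for ps in per_scale), []) for t in range(len(frames))]


def adjacent_pairs(segments):
    """Pair each segment with its successor.

    n segments give n - 1 overlapping pairs, so each sample covers twice
    the audio length while the sample count stays almost unchanged.
    """
    return [(segments[i], segments[i + 1]) for i in range(len(segments) - 1)]


if __name__ == "__main__":
    # Toy stand-in for a short MFCC sequence: 4 frames, 2 coefficients.
    frames = [[0.0, 1.0], [1.0, 1.0], [4.0, 1.0], [9.0, 1.0]]
    feats = multiscale_delta_features(frames, scales=(1, 2))
    print(len(feats), len(feats[0]))
    print(adjacent_pairs(["seg0", "seg1", "seg2", "seg3"]))
```

Overlapping pairs are one plausible reading of "increasing the length information of the sample without reducing the number of samples"; non-overlapping pairs would instead halve the sample count.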
Pages: 2705-2709
Number of pages: 5