VARIABILITY COMPENSATION IN SMALL DATA: OVERSAMPLED EXTRACTION OF I-VECTORS FOR THE CLASSIFICATION OF DEPRESSED SPEECH

Cited by: 0
Authors
Cummins, Nicholas [1 ,2 ]
Epps, Julien [1 ,2 ]
Sethu, Vidhyasaharan [1 ]
Krajewski, Jarek [3 ]
Affiliations
[1] Univ New S Wales, Sch Elect Engn & Telecommun, Sydney, NSW, Australia
[2] Natl ICT Australia, ATP Res Lab, Sydney, NSW, Australia
[3] Univ Wuppertal, Expt Ind Psychol, Wuppertal, Germany
Source
2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2014
Funding
Australian Research Council;
Keywords
Depression; Acoustic Variability; I-vectors; Linear Discriminant Analysis; Within Class Covariance Normalisation; t-Distributed Stochastic Neighbour Embedding; RECOGNITION; SEVERITY;
DOI
Not available
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
Variations in the acoustic space due to changes in speaker mental state are potentially overshadowed by variability due to speaker identity and phonetic content. Using the Audio/Visual Emotion Challenge and Workshop 2013 Depression Dataset we explore the suitability of i-vectors for reducing these latter sources of variability for distinguishing between low or high levels of speaker depression. In addition we investigate whether supervised variability compensation methods such as Linear Discriminant Analysis (LDA), and Within Class Covariance Normalisation (WCCN), applied in the i-vector domain, could be used to compensate for speaker and phonetic variability. Classification results show that i-vectors formed using an over-sampling methodology outperform a baseline set by KL-means supervectors. However the effect of these two compensation methods does not appear to improve system accuracy. Visualisations afforded by the t-Distributed Stochastic Neighbour Embedding (t-SNE) technique suggest that despite the application of these techniques, speaker variability is still a strong confounding effect.
Pages: 5
Related papers
13 in total
  • [1] I-vectors for image classification
    Smith, David C.
    APPLICATIONS OF DIGITAL IMAGE PROCESSING XXXVII, 2014, 9217
  • [2] i-Vectors in speech processing applications: a survey
    Verma, Pulkit
    Das, Pradip
    INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2015, 18 (04) : 529 - 546
  • [3] Robust Spoofed Speech Detection with Denoised I-vectors
    Disken, Gokay
    GAZI UNIVERSITY JOURNAL OF SCIENCE, 2023, 36 (04) : 1553 - 1561
  • [4] Modeling Spectral Variability for the Classification of Depressed Speech
    Cummins, Nicholas
    Epps, Julien
    Sethu, Vidhyasaharan
    Breakspear, Michael
    Goecke, Roland
    14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 857 - 861
  • [5] Speaker age classification and regression using i-vectors
    Grzybowska, Joanna
    Kacprzak, Stanislaw
    17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 1402 - 1406
  • [6] Audio-Visual Speech Separation Using I-Vectors
    Luo, Yiyu
    Wang, Jing
    Wang, Xinyao
    Wen, Liang
    Wang, Lizhong
    2019 2ND IEEE INTERNATIONAL CONFERENCE ON INFORMATION COMMUNICATION AND SIGNAL PROCESSING (ICICSP), 2019, : 276 - 280
  • [7] Duration compensation of i-vectors for short duration speaker verification
    Ma, Jianbo
    Sethu, Vidhyasaharan
    Ambikairajah, Eliathamby
    Lee, Kong Aik
    ELECTRONICS LETTERS, 2017, 53 (06) : 405 - 407
  • [8] Intersession compensation and scoring methods in the i-vectors space for speaker recognition
    Bousquet, Pierre-Michel
    Matrouf, Driss
    Bonastre, Jean-Francois
    12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 492 - 495
  • [9] IMPROVED SPEAKER RECOGNITION WHEN USING I-VECTORS FROM MULTIPLE SPEECH SOURCES
    McLaren, Mitchell
    van Leeuwen, David
    2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2011, : 5460 - 5463
  • [10] Automatic Evaluation of Speech Intelligibility Based on i-vectors in the Context of Head and Neck Cancers
    Laaridh, Imed
    Fredouille, Corinne
    Ghio, Alain
    Lalain, Muriel
    Woisard, Virginie
    19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 2943 - 2947