SPEECH INTELLIGIBILITY PREDICTION AS A CLASSIFICATION PROBLEM

Cited by: 0
Authors
Andersen, Asger Heidemann [1]
Schoenmaker, Esther [2]
van de Par, Steven [2]
Affiliations
[1] Oticon AS, DK-2765 Smorum, Denmark
[2] Carl von Ossietzky Univ Oldenburg, Cluster Excellence Hearing4all, Dept Med Phys & Acoust, D-26111 Oldenburg, Germany
Source
2016 IEEE 26TH INTERNATIONAL WORKSHOP ON MACHINE LEARNING FOR SIGNAL PROCESSING (MLSP) | 2016
Keywords
Speech intelligibility prediction; speech enhancement; binary classification; applications of machine learning; RECEPTION THRESHOLD; NOISE; RECOGNITION; PERCEPTION; INDEX;
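DOI
Not available
Chinese Library Classification number
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification code
0808; 0809
Abstract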
Speech Intelligibility Prediction (SIP) algorithms are becoming increasingly popular for objective evaluation of speech processing algorithms and transmission systems. Most often, SIP algorithms aim to predict the average intelligibility of an average listener in some specific listening condition. In the present work, we instead consider the aim of predicting the intelligibility of single words. That is, we attempt to predict whether or not a subject in a listening experiment was able to correctly repeat a particular word. We base the prediction on a noisy and potentially processed/degraded recording of the spoken word (as presented to a subject), as well as a clean reference recording of the spoken word. The problem can be treated as a supervised binary classification problem of predicting whether a specific word will or will not be understood. We investigate a number of different ways to extract features from the degraded and clean speech samples. The classification is carried out by means of Fisher discriminant analysis. Despite the large variability of speech intelligibility experiments, it is possible to obtain a considerable degree of predictive power.
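The abstract names Fisher discriminant analysis as the classifier but does not specify the features. The following is a minimal sketch of a two-class Fisher discriminant, not the authors' implementation. The two-dimensional feature vectors are synthetic placeholders standing in for features extracted from a degraded and clean word pair, and all function names, the ridge term, and the midpoint threshold are assumptions made for illustration.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def mean_vector(rows):
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def scatter_matrix(rows, mean):
    """Sum of outer products of the deviations from the class mean."""
    d = len(mean)
    s = [[0.0] * d for _ in range(d)]
    for row in rows:
        diff = [x - m for x, m in zip(row, mean)]
        for i in range(d):
            for j in range(d):
                s[i][j] += diff[i] * diff[j]
    return s


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_fisher(x_missed, x_understood, ridge=1e-6):
    """Return (w, threshold) with w = Sw^-1 (m1 - m0).

    The small ridge on the diagonal of Sw is an assumption that keeps
    the linear system well conditioned.
    """
    m0 = mean_vector(x_missed)
    m1 = mean_vector(x_understood)
    s0 = scatter_matrix(x_missed, m0)
    s1 = scatter_matrix(x_understood, m1)
    d = len(m0)
    sw = [[s0[i][j] + s1[i][j] + (ridge if i == j else 0.0)
           for j in range(d)] for i in range(d)]
    w = solve(sw, [b - a for a, b in zip(m0, m1)])
    # Assumed decision rule: midpoint between the projected class means.
    threshold = 0.5 * (dot(w, m0) + dot(w, m1))
    return w, threshold


def predict(w, threshold, x):
    """1 = word predicted as understood, 0 = predicted as missed."""
    return 1 if dot(w, x) > threshold else 0


# Synthetic placeholder features (NOT data from the paper).
random.seed(0)
understood = [[random.gauss(1.0, 0.5), random.gauss(1.0, 0.5)] for _ in range(50)]
missed = [[random.gauss(-1.0, 0.5), random.gauss(-1.0, 0.5)] for _ in range(50)]

w, threshold = fit_fisher(missed, understood)
hits = sum(predict(w, threshold, x) == 1 for x in understood)
hits += sum(predict(w, threshold, x) == 0 for x in missed)
accuracy = hits / (len(understood) + len(missed))
print("training accuracy on synthetic data:", accuracy)
```

In practice the per-word labels would come from listening-test responses, and accuracy would be measured on held-out words instead of the training set used in this toy example.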
Pages: 6