Non-Intrusive POLQA Estimation of Speech Quality using Recurrent Neural Networks

Cited by: 3
Authors
Sharma, Dushyant [1]
Hogg, Aidan O. T. [2]
Wang, Yu [3]
Nour-Eldin, Amr [1]
Naylor, Patrick A. [2]
Affiliations
[1] Nuance Commun, Burlington, MA 01803 USA
[2] Imperial Coll London, Dept Elect & Elect Engn, London, England
[3] Univ Cambridge, Dept Engn, Cambridge, England
Source
2019 27TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO) | 2019
Keywords
speech quality estimation; POLQA estimation; deep neural networks; INTELLIGIBILITY; CHANNELS; STANDARD;
DOI
10.23919/eusipco.2019.8902646
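Chinese Library Classification (CLC)
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];
Discipline classification codes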
0808 ; 0809 ;
Abstract
Estimating the quality of speech without the use of a clean reference signal is a challenging problem, in part due to the time and expense required to collect sufficient training data for modern machine learning algorithms. We present a novel, non-intrusive estimator that exploits recurrent neural network architectures to predict the intrusive POLQA score of a speech signal in a short time context. The predictor is based on a novel compressed representation of modulation domain features, used in conjunction with static MFCC features. We show that the proposed method can reliably predict POLQA with a 300 ms context, achieving a mean absolute error of 0.21 on unseen data. The proposed method is trained using English speech and is shown to generalize well across unseen languages. The neural network also jointly estimates the mean voice activity detection (VAD) with an F1 accuracy score of 0.9, removing the need for an external VAD.
Pages: 5