A data-driven non-intrusive measure of speech quality and intelligibility

Cited by: 46
Authors
Sharma, Dushyant [1]
Wang, Yu [2]
Naylor, Patrick A. [3]
Brookes, Mike [3]
Affiliations
[1] Nuance Commun Inc, Sunnyvale, CA 94085 USA
[2] Univ Cambridge, Cambridge, England
[3] Imperial Coll, London, England
Keywords
Speech quality; Speech intelligibility; CART; PESQ; STOI; Pattern recognition; Noise; Classification; Reverberant
DOI
10.1016/j.specom.2016.03.005
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Speech signals are often affected by additive noise and distortion, which can degrade the perceived quality and intelligibility of the signal. We present a new measure, NISA, for estimating the quality and intelligibility of speech degraded by additive noise and by distortions associated with telecommunications networks. It is based on a data-driven framework of feature extraction and tree-based regression. The new measure is non-intrusive: it operates on the degraded signal alone, without the need for a reference signal. This makes it applicable to practical speech processing applications operating in single-ended mode. The new measure has been evaluated against the intrusive measures PESQ and STOI. The results indicate that the accuracy of the new non-intrusive method is around 90% of the accuracy of the intrusive measures, depending on the test scenario. The NISA measure therefore provides non-intrusive (single-ended) PESQ and STOI estimates with high accuracy. © 2016 Elsevier B.V. All rights reserved.
Pages: 84-94
Number of pages: 11