An Evaluation of Output Signal to Noise Ratio as a Predictor of Cochlear Implant Speech Intelligibility

Citations: 10
Authors
Watkins, Greg D. [1]
Swanson, Brett A. [2]
Suaning, Gregg J. [1]
Affiliations
[1] Univ Sydney, Sch Aerosp Mech & Mechatro Engn, Fac Engn & Informat Technol, Sydney, NSW, Australia
[2] Cochlear Ltd, Sydney, NSW, Australia
Keywords
Cochlear implants; Output signal to noise; Prediction; Speech intelligibility; PSYCHOMETRIC FUNCTION; RECEPTION THRESHOLD; RECOGNITION; STIMULATION; SENTENCES; PHONEME;
DOI
10.1097/AUD.0000000000000556
Chinese Library Classification
R36 [Pathology]; R76 [Otorhinolaryngology];
Discipline classification codes
100104 ; 100213 ;
Abstract
Objectives: Cochlear implant (CI) sound processing strategies are usually evaluated in clinical studies involving experienced implant recipients. Metrics which estimate the capacity to perceive speech for a given set of audio and processing conditions provide an alternative means to assess the effectiveness of processing strategies. The aim of this research was to assess the ability of the output signal to noise ratio (OSNR) to accurately predict speech perception. It was hypothesized that compared with the other metrics evaluated in this study (1) OSNR would have equivalent or better accuracy and (2) OSNR would be the most accurate in the presence of variable levels of speech presentation.

Design: For the first time, the accuracy of OSNR as a metric which predicts speech intelligibility was compared, in a retrospective study, with that of the input signal to noise ratio (ISNR) and the short-term objective intelligibility (STOI) metric. Because STOI measured audio quality at the input to a CI sound processor, a vocoder was applied to the sound processor output and STOI was also calculated for the reconstructed audio signal (vocoder short-term objective intelligibility [VSTOI] metric). The figures of merit calculated for each metric were Pearson correlation of the metric and a psychometric function fitted to sentence scores at each predictor value (Pearson sigmoidal correlation [PSIG]), epsilon insensitive root mean square error (RMSE*) of the psychometric function and the sentence scores, and the statistical deviance of the fitted curve to the sentence scores (D). Sentence scores were taken from three existing data sets of Australian Sentence Tests in Noise (AuSTIN) results. The AuSTIN tests were conducted with experienced users of the Nucleus CI system. The score for each sentence was the proportion of morphemes the participant correctly repeated. In data set 1, all sentences were presented at 65 dB sound pressure level (SPL) in the presence of four-talker babble noise. Each block of sentences used an adaptive procedure, with the speech presented at a fixed level and the ISNR varied. In data set 2, sentences were presented at 65 dB SPL in the presence of stationary speech weighted noise, street-side city noise, and cocktail party noise. An adaptive ISNR procedure was used. In data set 3, sentences were presented at levels ranging from 55 to 89 dB SPL with two automatic gain control configurations and two fixed ISNRs.

Results: For data set 1, the ISNR and OSNR were equally most accurate. STOI was significantly different for deviance (p = 0.045) and RMSE* (p < 0.001). VSTOI was significantly different for RMSE* (p < 0.001). For data set 2, ISNR and OSNR had an equivalent accuracy which was significantly better than that of STOI for PSIG (p = 0.029) and VSTOI for deviance (p = 0.001), RMSE*, and PSIG (both p < 0.001). For data set 3, OSNR was the most accurate metric and was significantly more accurate than VSTOI for deviance, RMSE*, and PSIG (all p < 0.001). ISNR and STOI were unable to predict the sentence scores for this data set.

Conclusions: The study results supported the hypotheses. OSNR was found to have an accuracy equivalent to or better than ISNR, STOI, and VSTOI for tests conducted at a fixed presentation level and variable ISNR. OSNR was a more accurate metric than VSTOI for tests with fixed ISNRs and variable presentation levels. Overall, OSNR was the most accurate metric across the three data sets. OSNR holds promise as a prediction metric which could potentially improve the effectiveness of sound processor research and CI fitting.
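Illustrative sketch (not from the paper): the abstract names two of its figures of merit, a Pearson correlation against a sigmoidal psychometric function and an epsilon insensitive RMSE, without giving their formulas. The Python below shows one plausible reading under stated assumptions: a logistic psychometric function with a hand-picked midpoint and slope rather than a fitted one, a fixed epsilon, and made-up scores. The function names, parameter values, and data are all hypothetical, and the authors' exact definitions may differ.

```python
import math


def logistic(x, midpoint, slope):
    """Sigmoidal psychometric function: predicted proportion correct at x."""
    return 1.0 / (1.0 + math.exp(-slope * (x - midpoint)))


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def eps_insensitive_rmse(observed, predicted, eps):
    """RMSE in which absolute errors up to eps count as zero (assumed form)."""
    excess = [max(0.0, abs(o - p) - eps) for o, p in zip(observed, predicted)]
    return math.sqrt(sum(e * e for e in excess) / len(excess))


# Made-up illustration data: predictor values in dB and sentence scores (0..1).
snr_db = [-4.0, -2.0, 0.0, 2.0, 4.0]
scores = [0.10, 0.30, 0.50, 0.75, 0.90]

# Hypothetical psychometric parameters; a real analysis would fit these.
predicted = [logistic(x, midpoint=0.0, slope=0.6) for x in snr_db]

psig = pearson(predicted, scores)
rmse_star = eps_insensitive_rmse(scores, predicted, eps=0.05)
print(f"PSIG-like correlation: {psig:.3f}, RMSE*-like error: {rmse_star:.4f}")
```

In this reading, a predictor that orders the sentence scores well along a sigmoid yields a correlation near 1, and small deviations within the epsilon band do not contribute to the error.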
Pages: 958-968
Number of pages: 11