Self-conducted speech audiometry using automatic speech recognition: Simulation results for listeners with hearing loss

Cited by: 8
Authors
Ooster, Jasper [1 ,3 ]
Tuschen, Laura [2 ]
Meyer, Bernd T. [1 ,3 ]
Affiliations
[1] Carl von Ossietzky Univ Oldenburg, Commun Acoust, D-26129 Oldenburg, Germany
[2] Fraunhofer Inst Digital Media Technol IDMT, Oldenburg Branch Hearing Speech & Audio Technol HS, D-26129 Oldenburg, Germany
[3] Cluster Excellence Hearing4all, Oldenburg, Germany
Keywords
Speech audiometry; Automatic speech recognition; Matrix sentence test; Unsupervised measurement; NOISE; INTELLIGIBILITY; VALIDATION; SENTENCES; TESTS;
DOI
10.1016/j.csl.2022.101447
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
Speech-in-noise tests are an important tool for assessing hearing impairment and the success of hearing aid fittings, as well as for research in psychoacoustics. An important drawback of many speech-based tests is that an expert must be present during the measurement to assess the listener's performance. This drawback may be largely overcome through the use of automatic speech recognition (ASR), which logs responses automatically. However, such an unsupervised system may reduce measurement accuracy by introducing recognition errors. In this study, two ASR systems are compared for automated testing: a system with a feed-forward deep neural network (DNN) from a previous study (Ooster et al., 2018), and a state-of-the-art system utilizing a time-delay neural network (TDNN). The dynamic measurement procedure of the speech intelligibility test was simulated by taking the subjects' hearing loss into account and selecting from real recordings of test participants. The ASR systems' performance is investigated on responses from 73 listeners, ranging from normal-hearing (NH) to severely hearing-impaired (HI), as well as on read speech from cochlear implant (CI) listeners. The feed-forward DNN produced accurate test results for NH and unaided HI listeners, but its measurement accuracy decreased in the simulation of the adaptive measurement procedure for aided, severely HI listeners recorded in noisy environments with a loudspeaker setup. The TDNN system produces error rates of 0.6% and 3.0% for deletion and insertion errors, respectively. We estimate that the speech reception threshold (SRT) deviation with this system is below 1.38 dB for 95% of users. This result indicates that a robust unsupervised administration of the matrix sentence test is possible with an accuracy similar to that of a human supervisor, even in noisy conditions and with altered or disordered speech from elderly, severely HI listeners and listeners with a CI.
Pages: 14