Using fuzzy string matching for automated assessment of listener transcripts in speech intelligibility studies

Cited by: 20
Author
Bosker, Hans Rutger [1, 2]
Affiliations
[1] Max Planck Inst Psycholinguist, POB 310, NL-6500 AH Nijmegen, Netherlands
[2] Radboud Univ Nijmegen, Donders Inst Brain Cognit & Behav, Nijmegen, Netherlands
Keywords
Speech intelligibility; Fuzzy string matching; Transcription accuracy; Automated assessment; Token sort ratio; Perception; Children; Noise
DOI
10.3758/s13428-021-01542-4
Chinese Library Classification number
B841 [Research methods in psychology]
Discipline classification code
040201
Abstract
Many studies of speech perception assess the intelligibility of spoken sentence stimuli by means of transcription tasks ('type out what you hear'). The intelligibility of a given stimulus is then often expressed in terms of percentage of words correctly reported from the target sentence. Yet scoring the participants' raw responses for words correctly identified from the target sentence is a time-consuming task, and hence resource-intensive. Moreover, there is no consensus among speech scientists about what specific protocol to use for the human scoring, limiting the reliability of human scores. The present paper evaluates various forms of fuzzy string matching between participants' responses and target sentences, as automated metrics of listener transcript accuracy. We demonstrate that one particular metric, the token sort ratio, is a consistent, highly efficient, and accurate metric for automated assessment of listener transcripts, as evidenced by high correlations with human-generated scores (best correlation: r = 0.940) and a strong relationship to acoustic markers of speech intelligibility. Thus, fuzzy string matching provides a practical tool for assessment of listener transcript accuracy in large-scale speech intelligibility studies. See https://tokensortratio.netlify.app for an online implementation.
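To make the idea of a token sort ratio concrete, the following is a minimal sketch and not the paper's implementation. It assumes the common formulation in which both strings are lowercased, split into word tokens, sorted alphabetically, rejoined, and then compared with a character-level similarity ratio. The function name `token_sort_ratio`, the tokenization rule, and the use of Python's standard-library `difflib.SequenceMatcher` as the similarity measure are all illustrative assumptions; dedicated fuzzy matching libraries may compute slightly different values.

```python
import re
from difflib import SequenceMatcher


def token_sort_ratio(response: str, target: str) -> float:
    """Return a 0-100 similarity score that ignores word order.

    Illustrative sketch only: tokenization and the underlying
    similarity measure are assumptions, not the paper's exact method.
    """

    def normalize(text: str) -> str:
        # Lowercase, keep letters, digits and apostrophes as word tokens,
        # then sort the tokens so that word order no longer matters.
        tokens = re.findall(r"[a-z0-9']+", text.lower())
        return " ".join(sorted(tokens))

    a = normalize(response)
    b = normalize(target)
    # SequenceMatcher.ratio() returns 2*M/T, where M is the number of
    # matching characters and T the total length of both strings.
    return 100.0 * SequenceMatcher(None, a, b, autojunk=False).ratio()


if __name__ == "__main__":
    target = "the cat sat on the mat"
    # Same words in a different order score 100.0 after sorting.
    print(token_sort_ratio("on the mat the cat sat", target))
    # A partially correct transcript scores somewhere below 100.
    print(token_sort_ratio("the cat sat on a hat", target))
```

Sorting the tokens before comparison is what makes the score insensitive to word order, while the character-level ratio still gives partial credit for near misses such as typos.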
Pages: 1945-1953
Page count: 9