Measuring the cognitive load of synthetic speech using a dual task paradigm

Cited by: 9
Authors
Govender, Avashna [1]
King, Simon [1]
Affiliations
[1] Univ Edinburgh, Ctr Speech Technol Res, Edinburgh, Midlothian, Scotland
Source
19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES | 2018
Funding
European Union Horizon 2020
Keywords
cognitive load; dual task paradigm; speech synthesis; listening effort; fatigue
DOI
10.21437/Interspeech.2018-1199
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We present a methodology for measuring the cognitive load (listening effort) of synthetic speech using a dual task paradigm. Cognitive load is calculated from changes in a listener's performance on a secondary task (e.g., reaction time to decide if a visually-displayed digit is odd or even). Previous related studies have only found significant differences between the best and worst quality systems but failed to separate the systems that lie in between. A paradigm that is sensitive enough to detect differences between state-of-the-art, high quality speech synthesizers would be very useful for advancing the state of the art. In our work, four speech synthesis systems from a previous Blizzard Challenge, and the corresponding natural speech, were compared. Our results show that reaction times slow down as speech quality reduces, as we expected: lower quality speech imposes a greater cognitive load, taking resources away from the secondary task. However, natural speech did not have the fastest reaction times. This intriguing result might indicate that, as speech synthesizers attain near-perfect intelligibility, this paradigm is measuring something like the listener's level of sustained attention and not listening effort.
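The abstract describes cognitive load as being calculated from changes in secondary-task performance such as reaction time. A minimal sketch of one way to express that idea follows, assuming a single-task baseline is available for comparison (the abstract does not state the exact formula the authors use). The function name `dual_task_cost`, the condition labels, and all reaction times are hypothetical and invented for illustration only.

```python
from statistics import mean


def dual_task_cost(baseline_rts_ms, dual_task_rts_ms):
    """Mean slowdown (ms) on the secondary task while listening.

    Assumption: cost is the difference between mean reaction times with
    and without the listening task. This is a common way to quantify
    dual-task effects, not necessarily the paper's own calculation.
    """
    return mean(dual_task_rts_ms) - mean(baseline_rts_ms)


# Hypothetical reaction times in milliseconds (not data from the paper).
baseline = [500, 520, 510]
by_condition = {
    "system_A": [560, 580, 570],
    "system_B": [620, 640, 630],
}

# A larger cost suggests that more resources were diverted to listening.
costs = {name: dual_task_cost(baseline, rts) for name, rts in by_condition.items()}
```

Under this sketch, a condition with a larger cost would be read as imposing a greater load on the listener, which matches the direction of the effect the abstract reports for lower quality speech.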
Pages: 2843-2847
Page count: 5