Is automatic speech-to-text transcription ready for use in psychological experiments?

Cited by: 16
Authors
Ziman, Kirsten [1]
Heusser, Andrew C. [1]
Fitzpatrick, Paxton C. [1]
Field, Campbell E. [1]
Manning, Jeremy R. [1]
Affiliations
[1] Dartmouth Coll, Dept Psychol & Brain Sci, Hanover, NH 03755 USA
Keywords
Annotation; Free recall; Mechanical Turk; Memory; Speech-to-text; Verbal response; BEHAVIORAL-EXPERIMENTS; REHEARSAL; MODELS
DOI
10.3758/s13428-018-1037-4
Chinese Library Classification
B841 [Research methods in psychology]
Discipline classification code
040201
Abstract
Verbal responses are a convenient and naturalistic way for participants to provide data in psychological experiments (Salzinger, 1959, The Journal of General Psychology, 61(1), 65-94). However, audio recordings of verbal responses typically require additional processing, such as transcribing the recordings into text, as compared with other behavioral response modalities (e.g., typed responses, button presses). Further, the transcription process is often tedious and time-intensive, requiring human listeners to manually examine each moment of recorded speech. Here we evaluate the performance of a state-of-the-art speech recognition algorithm (Halpern et al., 2016) in transcribing audio data into text during a list-learning experiment. We compare transcripts made by human annotators to the computer-generated transcripts. Both sets of transcripts matched to a high degree and exhibited similar statistical properties, in terms of the participants' recall performance and recall dynamics that the transcripts captured. This proof-of-concept study suggests that speech-to-text engines could provide a cheap, reliable, and rapid means of automatically transcribing speech data in psychological experiments. Further, our findings open the door for verbal response experiments that scale to thousands of participants (e.g., administered online), as well as a new generation of experiments that decode speech on the fly and adapt experimental parameters based on participants' prior responses.
Pages: 2597-2605
Page count: 9