Golden speaker builder - An interactive tool for pronunciation training

Cited by: 36
Authors
Ding, Shaojin [1]
Liberatore, Christopher [1]
Sonsaat, Sinem [2]
Lucic, Ivana [2]
Silpachai, Alif [2]
Zhao, Guanlong [1]
Chukharev-Hudilainen, Evgeny [2]
Levis, John [2]
Gutierrez-Osuna, Ricardo [1]
Affiliations
[1] Texas A&M Univ, Dept Comp Sci & Engn, College Stn, TX 77843 USA
[2] Iowa State Univ, Dept English, Ames, IA USA
Keywords
LINEAR TRANSFORMATION; EXPLICIT CORRECTION; FOREIGN ACCENT; LEARNER REPAIR; ERROR TYPES; FLUENCY; COMPREHENSIBILITY; RECASTS; SPEECH; NEGOTIATION
DOI
10.1016/j.specom.2019.10.005
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
The type of voice model used in Computer Assisted Pronunciation Instruction is a crucial factor in the quality of practice and the amount of uptake by language learners. For example, prior research indicates that second-language learners are more likely to succeed when they imitate a speaker with a voice similar to their own, a so-called "golden speaker". This manuscript presents Golden Speaker Builder (GSB), a tool that allows learners to generate a personalized "golden-speaker" voice: one that mirrors their own voice but with a native accent. We describe the overall system design, including the web application with its user interface, and the underlying speech analysis/synthesis algorithms. Next, we present results from a series of listening tests, which show that GSB is capable of synthesizing such golden-speaker voices. Finally, we present results from a user study in a language-instruction setting, which show that practising with GSB leads to improved fluency and comprehensibility. We suggest reasons why learners improved as they did and offer recommendations for the next iteration of the training.
Pages: 51-66
Page count: 16