A review of tools and techniques for computer aided pronunciation training (CAPT) in English

Cited by: 33
Authors
Agarwal, Chesta [1]
Chakraborty, Pinaki [1]
Affiliations
[1] Netaji Subhas Univ Technol, Div Comp Engn, New Delhi, India
Keywords
Educational software; Computer aided pronunciation training (CAPT); English as a second language; English as a foreign language; Phonetics; LANGUAGE; RECOGNITION; DISCOVERY; SPEECH;
DOI
10.1007/s10639-019-09955-7
Chinese Library Classification number
G40 [Education]
Subject classification codes
040101; 120403
Abstract
The widespread use of English in academia and business is leading an increasing number of people to learn it as a second or foreign language. Computer aided pronunciation training (CAPT) systems are used by non-native English speakers to improve their English pronunciation. A typical CAPT tool records a learner's speech, detects and diagnoses mispronunciations in it, and suggests ways to correct them. We classified CAPT systems for English into four categories based on the technology they use and studied the salient features of each category. We observed that visual simulation based systems are suitable for young and naive learners; game based systems are advantageous because they can be personalized to the learners' requirements; comparative phonetics based systems are suitable for adult learners fluent in another language; and artificial neural network based systems have the highest accuracy in mispronunciation diagnosis and are suitable for experienced and professional learners. We identified the state-of-the-art practices used in CAPT systems and observed that CAPT systems can detect up to 86% of the mispronunciations in speech and help learners reduce mispronunciations by up to 23%. We recommend collaboration between language teachers and software developers to develop CAPT tools, wide dissemination of these tools and their integration with the curriculum at school and university levels, and further investigation of mobile and collaborative CAPT systems.
Pages: 3731-3743
Page count: 13