AUTOMATIC PREDICTION OF INTELLIGIBILITY OF WORDS AND PHONEMES PRODUCED ORALLY BY JAPANESE LEARNERS OF ENGLISH

Times Cited: 3
Authors
Zhu, Chuanbo [1 ]
Kunihara, Takuya [1 ]
Saito, Daisuke [1 ]
Minematsu, Nobuaki [1 ]
Nakanishi, Noriko [2 ]
Affiliations
[1] Univ Tokyo, Grad Sch Engn, Tokyo, Japan
[2] Kobe Gakuin Univ, Fac Global Commun, Kobe, Hyogo, Japan
Source
2022 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP, SLT | 2022
Keywords
L2; speech; intelligibility; listener diversity; shadowing; machine learning; artificial neural network;
DOI
10.1109/SLT54892.2023.10023307
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The practical goal of language learning is smooth communication with others, and many teachers focus strongly on measuring not accentedness but intelligibility, often regarded as the correctness of listeners' actual understanding. However, automatic prediction of intelligibility has not been well developed, especially for smaller units such as words and phonemes. This is mainly because listeners' while-listening behaviors are difficult to measure, so it has been hard to build an L2 speech corpus with intelligibility annotation large enough to train a network-based predictor. In this paper, we annotate intelligibility using oral dictation with a small delay, i.e., shadowing, to collect a sufficiently large corpus from two raters with different language backgrounds. Since perceived intelligibility depends on a listener's language background, inter-rater differences should be taken into account. With this corpus, we therefore build a multi-rater neural model to predict each rater's intelligibility of the individual words and phonemes in L2 speech. Two tasks are examined: regression of intelligibility scores and classification of a given segment as intelligible or not. Results show that our model achieves higher F1 scores than the intra-rater agreements, indicating that it can simulate the two raters accurately even though they have different language backgrounds.
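The multi-rater idea described in the abstract can be pictured as a shared encoder with one output head per rater, so a single segment yields a separate intelligibility score for each rater. The sketch below illustrates that structure only; the dimensions, random weights, and function name are illustrative assumptions, not the paper's actual architecture or feature set.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: per-segment feature size, shared hidden
# layer width, and number of raters (the paper uses two raters).
N_FEATURES, N_HIDDEN, N_RATERS = 16, 8, 2

# Shared encoder weights plus one output head per rater.
W_shared = rng.normal(size=(N_FEATURES, N_HIDDEN))
W_heads = rng.normal(size=(N_RATERS, N_HIDDEN))

def predict_intelligibility(x):
    """Return one intelligibility score in (0, 1) per rater
    for a single segment feature vector x."""
    h = np.tanh(x @ W_shared)              # shared representation
    logits = W_heads @ h                   # one logit per rater
    return 1.0 / (1.0 + np.exp(-logits))   # sigmoid -> per-rater scores

x = rng.normal(size=N_FEATURES)            # stand-in segment features
scores = predict_intelligibility(x)

# Regression view: the scores themselves. Classification view: a
# segment counts as intelligible for rater r when its score > 0.5.
labels = scores > 0.5
print(scores, labels)
```

The per-rater heads are what let one network model the inter-rater differences the abstract mentions: the shared layers capture segment properties, while each head learns one rater's perception.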
Pages: 1029-1036
Page count: 8