AUTOMATIC PREDICTION OF INTELLIGIBILITY OF WORDS AND PHONEMES PRODUCED ORALLY BY JAPANESE LEARNERS OF ENGLISH

Times Cited: 3
Authors
Zhu, Chuanbo [1 ]
Kunihara, Takuya [1 ]
Saito, Daisuke [1 ]
Minematsu, Nobuaki [1 ]
Nakanishi, Noriko [2 ]
Affiliations
[1] Univ Tokyo, Grad Sch Engn, Tokyo, Japan
[2] Kobe Gakuin Univ, Fac Global Commun, Kobe, Hyogo, Japan
Source
2022 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP, SLT | 2022
Keywords
L2; speech; intelligibility; listener diversity; shadowing; machine learning; artificial neural network;
DOI
10.1109/SLT54892.2023.10023307
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The practical goal of language learning is smooth communication with others, and many teachers focus strongly on measuring not accentedness but intelligibility, often regarded as the correctness of listeners' actual understanding. However, automatic prediction of intelligibility has not been well developed, especially for smaller units such as words and phonemes. This is mainly because listeners' while-listening behaviors are difficult to measure, so it has been hard to build an L2 speech corpus with intelligibility annotation large enough to train a network-based predictor. In this paper, we annotate intelligibility using oral dictation with a small delay, i.e., shadowing, to collect a sufficiently large corpus from two raters with different language backgrounds. Since perceived intelligibility depends on a listener's language background, inter-rater differences should be taken into account. With this corpus, we therefore build a multi-rater neural model to predict each rater's intelligibility of the individual words and phonemes in L2 speech. Two tasks are examined: regression of intelligibility scores and classification of a given segment as intelligible or not. Results show that our model achieves higher F1 scores than the intra-rater agreements, indicating that it can simulate the two raters accurately even though they have different language backgrounds.
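The multi-rater idea described in the abstract can be pictured as a shared encoder with one output head per rater, so a single segment yields a separate intelligibility score for each rater. The sketch below illustrates that structure only; the dimensions, random weights, and function name are illustrative assumptions, not the paper's actual architecture or feature set.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: per-segment feature size, shared hidden
# layer width, and number of raters (the paper uses two raters).
N_FEATURES, N_HIDDEN, N_RATERS = 16, 8, 2

# Shared encoder weights plus one output head per rater.
W_shared = rng.normal(size=(N_FEATURES, N_HIDDEN))
W_heads = rng.normal(size=(N_RATERS, N_HIDDEN))

def predict_intelligibility(x):
    """Return one intelligibility score in (0, 1) per rater
    for a single segment feature vector x."""
    h = np.tanh(x @ W_shared)              # shared representation
    logits = W_heads @ h                   # one logit per rater
    return 1.0 / (1.0 + np.exp(-logits))   # sigmoid -> per-rater scores

x = rng.normal(size=N_FEATURES)            # stand-in segment features
scores = predict_intelligibility(x)

# Regression view: the scores themselves. Classification view: a
# segment counts as intelligible for rater r when its score > 0.5.
labels = scores > 0.5
print(scores, labels)
```

The per-rater heads are what let one network model the inter-rater differences the abstract mentions: the shared layers capture segment properties, while each head learns one rater's perception.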
Pages: 1029-1036
Page count: 8