An Alignment Method Leveraging Articulatory Features for Mispronunciation Detection and Diagnosis in L2 English

Cited by: 2

Authors:

Chen, Qi [1]

Lin, Binghuai [2]

Xie, Yanlu [1]

Affiliations:

[1] Beijing Language & Culture Univ, Beijing, Peoples R China

[2] Tencent Technol Co Ltd, Smart Platform Prod Dept, Shenzhen, Peoples R China

Source:

INTERSPEECH 2022 | 2022

Funding:

Fundamental Research Funds for the Central Universities

Keywords:

computer-aided pronunciation training (CAPT); mispronunciation detection and diagnosis (MD&D); articulatory features; alignment

DOI:

10.21437/Interspeech.2022-10309

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

Mispronunciation Detection and Diagnosis (MD&D) technology detects mispronunciations and provides feedback to learners. Most MD&D systems are based on phoneme recognition, yet few studies make use of the predefined reference text that second language (L2) learners are given while practicing pronunciation. In this paper, we propose a novel alignment method, based on linguistic knowledge of the manner and place of articulation, that aligns the phone sequence of the reference text with the L2 learner's speech. After obtaining the alignment, we concatenate the corresponding phoneme embedding with the acoustic features of each speech frame and use the result as input. This makes reasonable use of the reference text as extra input. Experimental results show that the model can implicitly learn valid information from the reference text in this way while avoiding the misleading information in the reference text that would cause false acceptance (FA). In addition, the method incorporates articulatory features, which help the model recognize phonemes. Evaluated on the L2-ARCTIC dataset, our approach improves the F1-score over the state-of-the-art system by 4.9% relative.
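
To make the idea of an articulatory-feature-based alignment concrete, the following is a minimal sketch and not the authors' implementation. It assumes the reference phones are aligned against a recognized phone sequence by edit-distance dynamic programming, with a substitution cost that is lower when two phones share manner or place of articulation. The feature table, the cost values, and the names `FEATURES`, `GAP_COST`, `sub_cost`, and `align` are all hypothetical choices made for this illustration.

```python
# Illustrative sketch only: the feature table, the costs, and the function
# names are assumptions for this example, not the method from the paper.

# Hypothetical (manner, place) labels for a handful of English consonants.
FEATURES = {
    "p": ("stop", "bilabial"),
    "b": ("stop", "bilabial"),
    "t": ("stop", "alveolar"),
    "d": ("stop", "alveolar"),
    "s": ("fricative", "alveolar"),
    "z": ("fricative", "alveolar"),
    "th": ("fricative", "dental"),
    "m": ("nasal", "bilabial"),
    "n": ("nasal", "alveolar"),
}

GAP_COST = 1.0  # assumed cost of one insertion or deletion


def sub_cost(a, b):
    """Return 0 for identical phones, otherwise 0.5 or 1.0.

    Phones that differ in at most one of (manner, place) cost 0.5;
    phones that differ in both cost 1.0.
    """
    if a == b:
        return 0.0
    manner_a, place_a = FEATURES[a]
    manner_b, place_b = FEATURES[b]
    mismatches = (manner_a != manner_b) + (place_a != place_b)
    return max(0.5, 0.5 * mismatches)


def align(ref, hyp):
    """Align two phone lists; return (ref_phone, hyp_phone) pairs, None marks a gap."""
    n, m = len(ref), len(hyp)
    # dist[i][j] = minimal cost of aligning ref[:i] with hyp[:j]
    dist = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i * GAP_COST
    for j in range(1, m + 1):
        dist[0][j] = j * GAP_COST
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dist[i][j] = min(
                dist[i - 1][j - 1] + sub_cost(ref[i - 1], hyp[j - 1]),
                dist[i - 1][j] + GAP_COST,  # ref phone has no counterpart
                dist[i][j - 1] + GAP_COST,  # extra phone in hyp
            )
    # Trace back from the bottom-right cell to recover one optimal alignment.
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and
                dist[i][j] == dist[i - 1][j - 1] + sub_cost(ref[i - 1], hyp[j - 1])):
            pairs.append((ref[i - 1], hyp[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + GAP_COST:
            pairs.append((ref[i - 1], None))
            i -= 1
        else:
            pairs.append((None, hyp[j - 1]))
            j -= 1
    pairs.reverse()
    return pairs


if __name__ == "__main__":
    # "th" is paired with "s" because both are fricatives; "t" is left unmatched.
    print(align(["th", "n", "t"], ["s", "n"]))
```

In this sketch, a reference phone paired with `None` would simply have no counterpart in the recognized sequence. How the paper itself maps aligned phones to speech frames before concatenating phoneme embeddings with acoustic features is not specified in the abstract and is not modeled here.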

Pages: 4342-4346

Page count: 5