An Alignment Method Leveraging Articulatory Features for Mispronunciation Detection and Diagnosis in L2 English

Cited by: 2

Authors:

Chen, Qi [1]

Lin, Binghuai [2]

Xie, Yanlu [1]

Affiliations:

[1] Beijing Language and Culture University, Beijing, People's Republic of China

[2] Tencent Technology Co., Ltd., Smart Platform Product Department, Shenzhen, People's Republic of China

Source:

INTERSPEECH 2022 | 2022

Funding:

Fundamental Research Funds for the Central Universities

Keywords:

computer-aided pronunciation training (CAPT); mispronunciation detection and diagnosis (MD&D); articulatory features; alignment

DOI:

10.21437/Interspeech.2022-10309

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

Mispronunciation detection and diagnosis (MD&D) technology detects mispronunciations and provides feedback to learners. Most MD&D systems are based on phoneme recognition. However, few studies have made use of the predefined reference text that is provided to second language (L2) learners during pronunciation practice. In this paper, we propose a novel alignment method, based on linguistic knowledge of the manner and place of articulation, to align the phone sequence of the reference text with the L2 learner's speech. Given the alignment, we concatenate the corresponding phoneme embedding with the acoustic features of each speech frame and use the result as model input. This makes reasonable use of the reference text as extra input. Experimental results show that, with this method, the model can implicitly learn valid information from the reference text while avoiding the misleading information in the reference text that would otherwise cause false acceptances (FA). In addition, the method incorporates articulatory features, which helps the model recognize phonemes. We evaluate the method on the L2-ARCTIC dataset, where it improves the F1-score over the state-of-the-art system by 4.9% relative.
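The abstract does not spell out the alignment procedure, so the following is only a minimal sketch of one plausible reading: a standard edit-distance alignment between the reference phone sequence and a recognized phone sequence, in which the substitution cost counts how many articulatory attributes (manner, place, voicing) differ. The feature table `FEATURES`, the gap cost `GAP_COST`, and the function names `sub_cost` and `align` are hypothetical and chosen for illustration; they are not taken from the paper.

```python
# Hypothetical articulatory table: phone -> (manner, place, voiced).
# Illustrative subset only; not the inventory used in the paper.
FEATURES = {
    "p": ("stop", "bilabial", False),
    "b": ("stop", "bilabial", True),
    "t": ("stop", "alveolar", False),
    "d": ("stop", "alveolar", True),
    "s": ("fricative", "alveolar", False),
    "z": ("fricative", "alveolar", True),
    "th": ("fricative", "dental", False),
    "dh": ("fricative", "dental", True),
    "m": ("nasal", "bilabial", True),
    "n": ("nasal", "alveolar", True),
}

GAP_COST = 2.0  # assumed cost of an insertion or a deletion


def sub_cost(a, b):
    """Number of differing articulatory attributes (0.0 if phones match)."""
    if a == b:
        return 0.0
    fa, fb = FEATURES.get(a), FEATURES.get(b)
    if fa is None or fb is None:
        return 3.0  # unknown phone: treat as a full mismatch
    return float(sum(x != y for x, y in zip(fa, fb)))


def align(ref, hyp):
    """Align two phone sequences; None marks a gap on either side."""
    n, m = len(ref), len(hyp)
    # dp[i][j] = minimum cost of aligning ref[:i] with hyp[:j]
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * GAP_COST
    for j in range(1, m + 1):
        dp[0][j] = j * GAP_COST
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = min(
                dp[i - 1][j - 1] + sub_cost(ref[i - 1], hyp[j - 1]),
                dp[i - 1][j] + GAP_COST,  # reference phone deleted
                dp[i][j - 1] + GAP_COST,  # extra phone inserted
            )
    # Trace back from the bottom-right corner to recover the pairs.
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0
                and dp[i][j] == dp[i - 1][j - 1] + sub_cost(ref[i - 1], hyp[j - 1])):
            pairs.append((ref[i - 1], hyp[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and dp[i][j] == dp[i - 1][j] + GAP_COST:
            pairs.append((ref[i - 1], None))
            i -= 1
        else:
            pairs.append((None, hyp[j - 1]))
            j -= 1
    pairs.reverse()
    return pairs


if __name__ == "__main__":
    # Reference "dh s t" against a recognized "d s" (final phone dropped).
    print(align(["dh", "s", "t"], ["d", "s"]))
```

Under these assumed costs, substituting an articulatorily close phone (for example `t` for `d`, which differ only in voicing) is cheaper than substituting a distant one, so the alignment prefers pairing phones that a learner could plausibly confuse. Each aligned reference phone could then be mapped to an embedding and concatenated with the acoustic features of the frames it covers, as the abstract describes.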

Pages: 4342-4346

Number of pages: 5

