An Alignment Method Leveraging Articulatory Features for Mispronunciation Detection and Diagnosis in L2 English

Cited by: 2
Authors
Chen, Qi [1]
Lin, Binghuai [2]
Xie, Yanlu [1]
Affiliations
[1] Beijing Language & Culture Univ, Beijing, Peoples R China
[2] Tencent Technol Co Ltd, Smart Platform Prod Dept, Shenzhen, Peoples R China
Source
INTERSPEECH 2022 | 2022
Funding
Fundamental Research Funds for the Central Universities
Keywords
computer-aided pronunciation training (CAPT); mispronunciation detection and diagnosis (MD&D); articulatory features; alignment
DOI
10.21437/Interspeech.2022-10309
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Mispronunciation Detection and Diagnosis (MD&D) technology detects mispronunciations and provides feedback to learners. Most MD&D systems are based on phoneme recognition. However, few studies have made use of the predefined reference text that second language (L2) learners are given while practicing pronunciation. In this paper, we propose a novel alignment method, based on linguistic knowledge of the manner and place of articulation, that aligns the phone sequence of the reference text with the L2 learner's speech. After obtaining the alignment, we concatenate the corresponding phoneme embedding with the acoustic features of each speech frame and use the result as model input. In this way, the reference text serves as additional input. Experimental results show that the model can implicitly learn valid information from the reference text with this method, while avoiding the misleading information in the reference text that would otherwise cause false acceptances (FA). In addition, the method incorporates articulatory features, which help the model recognize phonemes. We evaluate the method on the L2-ARCTIC dataset, where our approach improves the F1-score over the state-of-the-art system by 4.9% relative.
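The abstract does not spell out the alignment procedure, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a Needleman-Wunsch style dynamic-programming alignment between a reference phone sequence and a recognized phone sequence, where the substitution cost is the fraction of articulatory features (manner, place, voicing) on which two phones differ. The feature table `FEATURES`, the constant `GAP_COST`, and the functions `sub_cost` and `align` are all hypothetical names introduced for illustration, and the table covers only a few ARPAbet consonants.

```python
# Hypothetical articulatory feature table: phone -> (manner, place, voicing).
# Covers only a handful of ARPAbet consonants, for illustration.
FEATURES = {
    "P": ("stop", "bilabial", "voiceless"),
    "B": ("stop", "bilabial", "voiced"),
    "T": ("stop", "alveolar", "voiceless"),
    "D": ("stop", "alveolar", "voiced"),
    "S": ("fricative", "alveolar", "voiceless"),
    "Z": ("fricative", "alveolar", "voiced"),
    "TH": ("fricative", "dental", "voiceless"),
    "DH": ("fricative", "dental", "voiced"),
    "M": ("nasal", "bilabial", "voiced"),
    "N": ("nasal", "alveolar", "voiced"),
}

GAP_COST = 1.0  # assumed cost of one insertion or deletion


def sub_cost(a, b):
    """Fraction of articulatory features on which two phones differ."""
    if a == b:
        return 0.0
    fa, fb = FEATURES.get(a), FEATURES.get(b)
    if fa is None or fb is None:
        return 1.0  # unknown phone: treat as maximally different
    return sum(x != y for x, y in zip(fa, fb)) / len(fa)


def align(ref, hyp):
    """Align two phone lists; return (ref_phone, hyp_phone) pairs.

    None on the hyp side marks a deleted reference phone, and None on
    the ref side marks an inserted phone.
    """
    n, m = len(ref), len(hyp)
    cost = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i * GAP_COST
    for j in range(1, m + 1):
        cost[0][j] = j * GAP_COST
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i][j] = min(
                cost[i - 1][j - 1] + sub_cost(ref[i - 1], hyp[j - 1]),
                cost[i - 1][j] + GAP_COST,   # delete ref phone
                cost[i][j - 1] + GAP_COST,   # insert hyp phone
            )

    # Trace back from the bottom-right cell to recover the pairing.
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and
                cost[i][j] == cost[i - 1][j - 1] + sub_cost(ref[i - 1], hyp[j - 1])):
            pairs.append((ref[i - 1], hyp[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + GAP_COST:
            pairs.append((ref[i - 1], None))
            i -= 1
        else:
            pairs.append((None, hyp[j - 1]))
            j -= 1
    pairs.reverse()
    return pairs


if __name__ == "__main__":
    print(align(["D", "N"], ["T"]))
```

With a plain 0/1 substitution cost, aligning the reference `D N` against a recognized `T` is ambiguous: either phone could be paired with `T`. Under the articulatory cost, `D` and `T` differ only in voicing while `N` and `T` differ in manner and voicing, so the sketch pairs `D` with `T` and marks `N` as deleted. In a setup like the one the abstract describes, the aligned reference phone for each frame would then be embedded and concatenated with that frame's acoustic features; that step is not shown here.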
Pages: 4342-4346
Page count: 5