NON-AUTOREGRESSIVE MANDARIN-ENGLISH CODE-SWITCHING SPEECH RECOGNITION

Cited by: 7
Authors
Chuang, Shun-Po [1]
Chang, Heng-Jui [1]
Huang, Sung-Feng [1]
Lee, Hung-yi [1]
Affiliations
[1] National Taiwan University, College of Electrical Engineering and Computer Science, Taipei, Taiwan
Source
2021 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU) | 2021
Keywords
non-autoregressive; code-switching; end-to-end speech recognition
DOI
10.1109/ASRU51503.2021.9688174
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Mandarin-English code-switching (CS) is common among speakers in East and Southeast Asia. However, intra-sentence switching between two very different languages makes CS speech difficult to recognize. Meanwhile, recent non-autoregressive (NAR) ASR models remove the need for the left-to-right beam decoding used in autoregressive (AR) models and achieve strong performance with fast inference, but they have not yet been applied to Mandarin-English CS speech recognition. This paper uses the Mask-CTC NAR ASR framework to tackle CS speech recognition. We further propose changing the encoder's Mandarin output target to Pinyin for faster encoder training, and we introduce a Pinyin-to-Mandarin decoder to learn contextualized information. Moreover, we use word embedding label smoothing to regularize the decoder with contextualized information, and projection matrix regularization to bridge the gap between the encoder and the decoder. We evaluate these methods on the SEAME corpus and obtain promising results.
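As a rough illustration of the Mask-CTC idea named in the abstract, the sketch below collapses a greedy CTC output and replaces low-confidence tokens with a mask symbol, which a masked-prediction decoder would then fill in. It is not the authors' implementation: the function names, the `<blank>` and `<mask>` symbols, the toy frame probabilities, the threshold value, and the choice to score a merged token by the maximum probability of its frames are all assumptions made for this example.

```python
from typing import List, Tuple

BLANK = "<blank>"  # assumed CTC blank symbol
MASK = "<mask>"    # assumed mask symbol passed on to the decoder


def ctc_greedy_collapse(frames: List[Tuple[str, float]]) -> List[Tuple[str, float]]:
    """Merge repeated frame labels and drop blanks.

    Each surviving token keeps the highest probability among the
    frames merged into it (an assumption of this sketch).
    """
    out: List[Tuple[str, float]] = []
    prev = None
    for token, prob in frames:
        if token == prev:
            if token != BLANK:
                last_tok, last_prob = out[-1]
                out[-1] = (last_tok, max(last_prob, prob))
        elif token != BLANK:
            out.append((token, prob))
        prev = token
    return out


def mask_low_confidence(tokens: List[Tuple[str, float]], threshold: float) -> List[str]:
    """Keep confident tokens; replace the others with MASK."""
    return [tok if prob >= threshold else MASK for tok, prob in tokens]


# Toy per-frame argmax labels with probabilities (hypothetical values).
frames = [
    ("wo", 0.90), ("wo", 0.95), (BLANK, 0.99),
    ("xiang", 0.40), ("go", 0.80), ("go", 0.70),
    (BLANK, 0.90), ("home", 0.97),
]
collapsed = ctc_greedy_collapse(frames)
masked = mask_low_confidence(collapsed, threshold=0.5)
print(masked)
```

In this toy run the uncertain Pinyin token is the only one masked, so a decoder would need to predict just that position while conditioning on the confident tokens on both sides.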
Pages: 465-472
Page count: 8