NON-AUTOREGRESSIVE MANDARIN-ENGLISH CODE-SWITCHING SPEECH RECOGNITION

Cited by: 7
Authors
Chuang, Shun-Po [1]
Chang, Heng-Jui [1]
Huang, Sung-Feng [1]
Lee, Hung-yi [1]
Affiliations
[1] Natl Taiwan Univ, Coll Elect Engn & Comp Sci, Taipei, Taiwan
Source
2021 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU) | 2021
Keywords
non-autoregressive; code-switching; end-to-end speech recognition
DOI
10.1109/ASRU51503.2021.9688174
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Mandarin-English code-switching (CS) is common among East and Southeast Asian speakers. However, intra-sentence switching between two very different languages makes CS speech challenging to recognize. Meanwhile, recent non-autoregressive (NAR) ASR models remove the need for the left-to-right beam decoding used in autoregressive (AR) models and achieve outstanding performance with fast inference, but they have not yet been applied to Mandarin-English CS speech recognition. This paper uses the Mask-CTC NAR ASR framework to tackle CS speech recognition. We further propose changing the Mandarin output target of the encoder to Pinyin for faster encoder training, and we introduce a Pinyin-to-Mandarin decoder to learn contextualized information. Moreover, we use word embedding label smoothing to regularize the decoder with contextualized information, and projection matrix regularization to bridge the gap between the encoder and decoder. We evaluate these methods on the SEAME corpus and achieve exciting results.
Pages: 465-472
Page count: 8
Related papers
50 in total
  • [41] MERLIon CCS Challenge: A English-Mandarin code-switching child-directed speech corpus for language identification and diarization
    Chua, Victoria Y. H.
    Liu, Hexin
    Perera, Leibny Paola Garcia
    Woon, Fei Ting
    Wong, Jinyi
    Zhang, Xiangyu
    Khudanpur, Sanjeev
    Khong, Andy W. H.
    Dauwels, Justin
    Styles, Suzy J.
    INTERSPEECH 2023, 2023, : 4109 - 4113
  • [42] Towards Language-universal Mandarin-English Speech Recognition with Unsupervised Label Synchronous Adaptation
    Li, Song
    Luo, Haoneng
    Hu, Wenxuan
    Liu, Yuan
    Zhang, Shiliang
    Li, Lin
    Hong, Qingyang
    2022 13TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2022, : 16 - 20
  • [43] CanVEC - the Canberra Vietnamese-English Code-switching Natural Speech Corpus
    Li Nguyen
    Bryant, Christopher
    PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2020), 2020, : 4121 - 4129
  • [45] CodeFed: Federated Speech Recognition for Low-Resource Code-Switching Detection
    Madan, Chetan
    Diddee, Harshita
    Kumar, Deepika
    Mittal, Mamta
    ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2024, 23 (01)
  • [46] Spike-Triggered Non-Autoregressive Transformer for End-to-End Speech Recognition
    Tian, Zhengkun
    Yi, Jiangyan
    Tao, Jianhua
    Bai, Ye
    Zhang, Shuai
    Wen, Zhengqi
    INTERSPEECH 2020, 2020, : 5026 - 5030
  • [47] Direct Speech in the context of discussion on code-switching
    Barciela, Lois Xacobe Atanes
    ESTUDOS DE LINGUISTICA GALEGA, 2023, 15
  • [48] The PF Disjunction Theorem to Southern Min/Mandarin code-switching
    Wang, Sung-Lan
    INTERNATIONAL JOURNAL OF BILINGUALISM, 2017, 21 (05) : 541 - 558
  • [49] Gender in Russian-English code-switching
    Chirsheva, Galina
    INTERNATIONAL JOURNAL OF BILINGUALISM, 2009, 13 (01) : 63 - 90
  • [50] A Study of Code-switching in the College English Classroom
    Lei, Chunxiao
    Overseas English (海外英语), 2015, (02) : 105 - 106