Self-supervised Learning and Masked Language Model for Code-switching Automatic Speech Recognition

Authors
Chen, Po-Kai [1 ]
Fu, Li-Yeh [2 ]
Chen, Cheng-Kai [1 ]
Lin, Yi-Xing [1 ]
Chen, Chih-Ping [1 ]
Huang, Chien-Lin [3 ]
Wang, Jia-Ching [1 ]
Affiliations
[1] Natl Cent Univ, Dept CSIE, Taoyuan, Taiwan
[2] Realtek Semicond Corp, Hsinchu, Taiwan
[3] Natl Cheng Kung Univ, Dept CSIE, Tainan, Taiwan
Source
2024 IEEE TENTH INTERNATIONAL CONFERENCE ON COMMUNICATIONS AND ELECTRONICS, ICCE 2024 | 2024
Keywords
code-switching; speech recognition; self-supervised learning; masked language modeling
DOI
10.1109/ICCE62051.2024.10634607
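Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809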
Abstract
Code-switching (CS) is a common linguistic phenomenon that poses significant challenges for automatic speech recognition (ASR) systems because labeled CS corpora are scarce. In this paper, we propose an approach that addresses this challenge by combining self-supervised learning (SSL) with a masked language model (MLM). Specifically, we use the pre-trained wav2vec 2.0 model to reduce the dependency on labeled CS data, and the MLM to rerank the hypotheses generated by beam search decoding. Evaluated on the SEAME dataset, the proposed method outperforms state-of-the-art CS speech recognition approaches by 15.6% and 19.9% in terms of token error rate (TER). Moreover, the proposed method is generalizable and can be extended to other CS language pairs. These results demonstrate the effectiveness of our approach and its potential for future research in CS speech recognition.
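The reranking step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the scoring formula, so this example assumes a common choice: each beam-search hypothesis is scored by its ASR log-score plus a weighted MLM pseudo-log-likelihood (mask one token at a time and sum the log-probabilities). The names `pseudo_log_likelihood`, `rerank`, `toy_masked_log_prob`, and `lm_weight` are hypothetical, and the toy scorer stands in for a real MLM; it is not the authors' model.

```python
import math
from typing import Callable, List, Tuple

# (tokens, ASR log-score from beam search); hypothetical data layout
Hypothesis = Tuple[List[str], float]
# (masked context, masked position, original token) -> log-probability
MaskedScorer = Callable[[List[str], int, str], float]


def pseudo_log_likelihood(tokens: List[str], masked_log_prob: MaskedScorer) -> float:
    """Sum log P(token_i | other tokens), masking one position at a time."""
    total = 0.0
    for i, token in enumerate(tokens):
        context = tokens[:i] + ["[MASK]"] + tokens[i + 1:]
        total += masked_log_prob(context, i, token)
    return total


def rerank(nbest: List[Hypothesis], masked_log_prob: MaskedScorer,
           lm_weight: float = 0.5) -> List[str]:
    """Return the hypothesis with the best combined ASR + weighted MLM score."""
    def combined(hyp: Hypothesis) -> float:
        tokens, asr_score = hyp
        return asr_score + lm_weight * pseudo_log_likelihood(tokens, masked_log_prob)
    return max(nbest, key=combined)[0]


def toy_masked_log_prob(context: List[str], position: int, target: str) -> float:
    """Stand-in for a real MLM (assumption): plausible tokens get high probability."""
    plausible = {"我", "想", "要", "coffee"}
    return math.log(0.5) if target in plausible else math.log(0.01)


# Toy Mandarin-English n-best list; the ASR alone slightly prefers the wrong word.
nbest = [
    (["我", "想", "要", "copy"], -1.0),
    (["我", "想", "要", "coffee"], -1.2),
]
best = rerank(nbest, toy_masked_log_prob, lm_weight=0.5)
print(" ".join(best))
```

In this toy case the ASR score alone ranks the first hypothesis higher, while the combined score selects the second. In practice `lm_weight` would be tuned on held-out data.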
Pages: 387-391
Number of pages: 5
Related papers
50 in total
  • [1] Learning Adapters for Code-Switching Speech Recognition
    He, Chun-Yi
    Chien, Jen-Tzung
    2023 ASIA PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE, APSIPA ASC, 2023, : 344 - 349
  • [2] DECOUPLING PRONUNCIATION AND LANGUAGE FOR END-TO-END CODE-SWITCHING AUTOMATIC SPEECH RECOGNITION
    Zhang, Shuai
    Yi, Jiangyan
    Tian, Zhengkun
    Bai, Ye
    Tao, Jianhua
    Wen, Zhengqi
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 6249 - 6253
  • [3] Consistency self-supervised learning method for robust automatic speech recognition
    Gao, Changfeng
    Cheng, Gaofeng
    Zhang, Pengyuan
    Shengxue Xuebao/Acta Acustica, 2023, 48 (03): : 578 - 587
  • [4] Code-Switching in Automatic Speech Recognition: The Issues and Future Directions
    Mustafa, Mumtaz Begum
    Yusoof, Mansoor Ali
    Khalaf, Hasan Kahtan
    Abushariah, Ahmad Abdel Rahman Mahmoud
    Kiah, Miss Laiha Mat
    Hua Nong Ting
    Muthaiyah, Saravanan
    APPLIED SCIENCES-BASEL, 2022, 12 (19):
  • [5] BENCHMARKING EVALUATION METRICS FOR CODE-SWITCHING AUTOMATIC SPEECH RECOGNITION
    Hamed, Injy
    Hussein, Amir
    Chellah, Oumnia
    Chowdhury, Shammur
    Mubarak, Hamdy
    Sitaram, Sunayana
    Habash, Nizar
    Ali, Ahmed
    2022 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP, SLT, 2022, : 999 - 1005
  • [6] Language-specific Characteristic Assistance for Code-switching Speech Recognition
    Song, Tongtong
    Xu, Qiang
    Ge, Meng
    Wang, Longbiao
    Shi, Hao
    Lv, Yongjie
    Lin, Yuqin
    Dang, Jianwu
    INTERSPEECH 2022, 2022, : 3924 - 3928
  • [7] Semi-supervised acoustic model training for speech with code-switching
    Yilmaz, Emre
    McLaren, Mitchell
    van den Heuvel, Henk
    van Leeuwen, David A.
    SPEECH COMMUNICATION, 2018, 105 : 12 - 22
  • [8] Domain Adaptive Self-supervised Training of Automatic Speech Recognition
    Do, Cong-Thanh
    Doddipatla, Rama
    Li, Mohan
    Hain, Thomas
    INTERSPEECH 2023, 2023, : 4389 - 4393
  • [9] OTF: Optimal Transport based Fusion of Supervised and Self-Supervised Learning Models for Automatic Speech Recognition
    Fu, Li
    Li, Siqi
    Li, Qingtao
    Li, Fangzhu
    Deng, Liping
    Fan, Lu
    Chen, Meng
    Wu, Youzheng
    He, Xiaodong
    INTERSPEECH 2023, 2023, : 934 - 938
  • [10] Phonetic Variation Modeling and a Language Model Adaptation for Korean English Code-Switching Speech Recognition
    Lee, Damheo
    Kim, Donghyun
    Yun, Seung
    Kim, Sanghun
    APPLIED SCIENCES-BASEL, 2021, 11 (06):