Self-supervised Learning and Masked Language Model for Code-switching Automatic Speech Recognition

Cited by: 0

Authors:

Chen, Po-Kai [1]

Fu, Li-Yeh [2]

Chen, Cheng-Kai [1]

Lin, Yi-Xing [1]

Chen, Chih-Ping [1]

Huang, Chien-Lin [3]

Wang, Jia-Ching [1]

Affiliations:

[1] Natl Cent Univ, Dept CSIE, Taoyuan, Taiwan

[2] Realtek Semicond Corp, Hsinchu, Taiwan

[3] Natl Cheng Kung Univ, Dept CSIE, Tainan, Taiwan

Source:

2024 IEEE TENTH INTERNATIONAL CONFERENCE ON COMMUNICATIONS AND ELECTRONICS, ICCE 2024 | 2024

Keywords:

code-switching; speech recognition; self-supervised learning; masked language modeling

DOI:

10.1109/ICCE62051.2024.10634607

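Chinese Library Classification (CLC):

TM [Electrical Engineering]; TN [Electronics and Communication Technology]

Discipline classification codes:

0808; 0809

Abstract: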

Code-switching (CS) is a common linguistic phenomenon that poses significant challenges for automatic speech recognition systems because labeled CS corpora are scarce. In this paper, we propose an approach that addresses this challenge by combining self-supervised learning (SSL) with a masked language model (MLM) in speech recognition. Specifically, we use the pre-trained wav2vec 2.0 model to reduce the dependency on labeled CS data, and the MLM to rerank the candidate sentences generated by beam search decoding. We evaluate the proposed method on the SEAME dataset, where it outperforms state-of-the-art CS speech recognition approaches by 15.6% and 19.9% in terms of token error rate (TER). Moreover, the proposed method is generalizable and can be extended to other CS language pairs. These results demonstrate the effectiveness of our approach and its potential for future research in CS speech recognition.
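
The reranking step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the MLM scores are computed or combined with the acoustic scores, so everything below is an assumption: the sketch uses pseudo-log-likelihood scoring (mask one token at a time and sum the log-probabilities of the true tokens), a linear interpolation weight `lm_weight`, and a hypothetical stand-in `toy_masked_log_prob` in place of a real masked language model. The n-best list and its scores are made up for illustration.

```python
import math
from typing import Callable, List, Sequence, Tuple

# A hypothesis is (tokens, ASR log-probability from beam search).
Hypothesis = Tuple[List[str], float]
# Hypothetical MLM interface: (masked context, masked position, true token) -> log-prob.
MaskedLogProb = Callable[[Sequence[str], int, str], float]


def pseudo_log_likelihood(tokens: Sequence[str], masked_log_prob: MaskedLogProb) -> float:
    """Sum of log P(token_i | all other tokens), masking one position at a time."""
    total = 0.0
    for i, tok in enumerate(tokens):
        context = list(tokens[:i]) + ["[MASK]"] + list(tokens[i + 1:])
        total += masked_log_prob(context, i, tok)
    return total


def rerank(nbest: Sequence[Hypothesis], masked_log_prob: MaskedLogProb,
           lm_weight: float = 0.5) -> List[List[str]]:
    """Sort hypotheses by ASR score plus weighted MLM score, best first.

    The linear combination and the default weight are assumptions for this sketch.
    """
    scored = [
        (asr_score + lm_weight * pseudo_log_likelihood(tokens, masked_log_prob), tokens)
        for tokens, asr_score in nbest
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [tokens for _, tokens in scored]


# Toy stand-in for an MLM (assumption): a token is "likely" if it forms a
# known pair with its left neighbour, and "unlikely" otherwise.
PLAUSIBLE_PAIRS = {("<s>", "我"), ("我", "想"), ("想", "去"), ("去", "school")}


def toy_masked_log_prob(context: Sequence[str], position: int, target: str) -> float:
    left = context[position - 1] if position > 0 else "<s>"
    return math.log(0.9) if (left, target) in PLAUSIBLE_PAIRS else math.log(0.1)


# Made-up 2-best list: the acoustically preferred hypothesis ends in a wrong word.
nbest_example: List[Hypothesis] = [
    (["我", "想", "去", "scoop"], -4.0),
    (["我", "想", "去", "school"], -4.2),
]
best = rerank(nbest_example, toy_masked_log_prob)[0]
print(" ".join(best))
```

In this toy case the second hypothesis has a slightly lower acoustic score, but the stand-in language model penalizes the implausible final token of the first hypothesis enough to swap the order. With a real MLM, `toy_masked_log_prob` would be replaced by a call that masks the position and reads the log-probability of the true token from the model output.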

Citation

Pages: 387-391

Page count: 5
