Acoustic data augmentation for Mandarin-English code-switching speech recognition

Cited by: 20

Authors:

Long, Yanhua [1]

Li, Yijie [2]

Zhang, Qiaozheng [1]

Wei, Shuang [1]

Ye, Hong [1]

Yang, Jichen [3]

Affiliations:

[1] Shanghai Normal Univ, SHNU Unisound Joint Lab Nat Human Comp Interact, Shanghai, Peoples R China

[2] Unisound AI Technol Co Ltd, Beijing, Peoples R China

[3] Natl Univ Singapore, Dept Elect & Comp Engn, Singapore, Singapore

Source:

APPLIED ACOUSTICS | 2020 / Vol. 161

Funding:

National Natural Science Foundation of China;

Keywords:

Data augmentation; Code-switching; Acoustic event detection; Speech recognition; NEURAL-NETWORKS;

DOI:

10.1016/j.apacoust.2019.107175

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

Code-switching (CS) is a multilingual phenomenon in which a speaker uses different languages within an utterance or between alternating utterances. Developing large-scale datasets for training code-switching acoustic and language models is challenging and extremely expensive. In this paper, we focus on acoustic data augmentation for the Mandarin-English CS speech recognition task. The effectiveness of conventional acoustic data augmentation approaches is examined. More importantly, we propose a CS acoustic event detection system based on a deep neural network to extract real code-switching speech segments automatically. Semi-supervised and active learning techniques are then investigated to generate transcriptions of these segments. Finally, a code-switching speech synthesis system is introduced to further enhance the acoustic modeling. Experimental results on the OC16-CE80 data, a Mandarin-English mixlingual speech corpus, demonstrate the effectiveness of the proposed methods. (C) 2019 Elsevier Ltd. All rights reserved.

Pages: 7

50 items in total

[1] Pronunciation augmentation for Mandarin-English code-switching speech recognition
Long, Yanhua
Wei, Shuang
Lian, Jie
Li, Yijie
EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2021, 2021 (01)
[2] Pronunciation augmentation for Mandarin-English code-switching speech recognition
Yanhua Long
Shuang Wei
Jie Lian
Yijie Li
EURASIP Journal on Audio, Speech, and Music Processing, 2021
[3] ADDRESSING ACCENT MISMATCH IN MANDARIN-ENGLISH CODE-SWITCHING SPEECH RECOGNITION
Tan, Zhili
Fan, Xinghua
Zhu, Hui
Lin, Ed
2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 8259 - 8263
[4] Language-specific Acoustic Boundary Learning for Mandarin-English Code-switching Speech Recognition
Fan, Zhiyun
Dong, Linhao
Shen, Chen
Liang, Zhenlin
Zhang, Jun
Lu, Lu
Ma, Zejun
INTERSPEECH 2023, 2023, : 3322 - 3326
[5] On the End-to-End Solution to Mandarin-English Code-switching Speech Recognition
Zeng, Zhiping
Khassanov, Yerbolat
Van Tung Pham
Xu, Haihua
Chng, Eng Siong
Li, Haizhou
INTERSPEECH 2019, 2019, : 2165 - 2169
[6] NON-AUTOREGRESSIVE MANDARIN-ENGLISH CODE-SWITCHING SPEECH RECOGNITION
Chuang, Shun-Po
Chang, Heng-Jui
Huang, Sung-Feng
Lee, Hung-yi
2021 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU), 2021, : 465 - 472
[7] Cyclic Transfer Learning for Mandarin-English Code-Switching Speech Recognition
Nga, Cao Hong
Vu, Duc-Quang
Luong, Huong Hoang
Huang, Chien-Lin
Wang, Jia-Ching
IEEE SIGNAL PROCESSING LETTERS, 2023, 30 : 1387 - 1391
[8] A Mandarin-English Code-Switching Corpus
Li, Ying
Yu, Yue
Fung, Pascale
LREC 2012 - EIGHTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2012, : 2515 - 2519
[9] INVESTIGATING END-TO-END SPEECH RECOGNITION FOR MANDARIN-ENGLISH CODE-SWITCHING
Shan, Changhao
Weng, Chao
Wang, Guangsen
Su, Dan
Luo, Min
Yu, Dong
Xie, Lei
2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 6056 - 6060
[10] TEXTUAL DATA AUGMENTATION FOR ARABIC-ENGLISH CODE-SWITCHING SPEECH RECOGNITION
Hussein, Amir
Chowdhury, Shammur Absar
Abdelali, Ahmed
Dehak, Najim
Ali, Ahmed
Khudanpur, Sanjeev
2022 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP, SLT, 2022, : 777 - 784
