Cross-Lingual and Ensemble MLPs Strategies for Low-Resource Speech Recognition

Cited by: 0

Authors:

Qian, Yanmin [1]

Liu, Jia [1]

Affiliations:

[1] Tsinghua Univ, Dept Elect Engn, Tsinghua Natl Lab Informat Sci & Technol, Beijing 100084, Peoples R China

Source:

13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3 | 2012

Keywords:

low-resource language; cross-lingual posterior features; hierarchical architectures; ensemble system

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

Recently there has been interest in how to build large-vocabulary continuous speech recognition (LVCSR) systems for low-resource languages. The scenario we focus on has only one hour of acoustic training data in the "target" language but more plentiful data in other languages. This paper presents approaches using MLP-based features: we construct a low-resource system that uses additional information from the non-target languages to train cross-lingual MLPs. A hierarchical architecture and a multi-stream strategy are applied at the cross-lingual phone level to make the neural networks more discriminative. Additionally, an elaborate ensemble system with various acoustic feature streams and context expansion lengths is proposed. After combining the systems built with these two strategies, we obtain significant improvements of more than 8% absolute over a conventional baseline in this low-resource scenario with only one hour of target training data.
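
As a rough illustration of two ideas named in the abstract, context expansion and the combination of ensemble members, the following Python sketch splices each feature frame with its neighbors and averages per-frame posterior vectors across systems. The function names, the edge padding by repetition, and the plain averaging rule are assumptions made for illustration only; the abstract does not state how the authors implement context expansion or system combination.

```python
from typing import List

Frame = List[float]


def splice_frames(frames: List[Frame], context: int) -> List[Frame]:
    """Concatenate each frame with `context` neighbors on each side.

    Edge frames are padded by repeating the first or last frame
    (an assumption; other padding schemes are possible).
    """
    if not frames:
        return []
    last = len(frames) - 1
    spliced = []
    for t in range(len(frames)):
        window: Frame = []
        for offset in range(-context, context + 1):
            # Clamp the index so edges reuse the nearest valid frame.
            idx = min(max(t + offset, 0), last)
            window.extend(frames[idx])
        spliced.append(window)
    return spliced


def average_posteriors(streams: List[List[Frame]]) -> List[Frame]:
    """Average per-frame posterior vectors across ensemble members.

    Plain averaging is a hypothetical stand-in for the paper's
    combination scheme, which the abstract does not describe.
    """
    n = len(streams)
    return [
        [sum(values) / n for values in zip(*frame_group)]
        for frame_group in zip(*streams)
    ]
```

Different values of `context` give the different context expansion lengths that an ensemble could be built from, and each member's posteriors would have to share the same phone inventory and frame count for the averaging step to make sense.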

Pages: 2581-2584

Page count: 4

