Cross-Lingual and Ensemble MLPs Strategies for Low-Resource Speech Recognition

Authors
Qian, Yanmin [1]
Liu, Jia [1]
Affiliations
[1] Tsinghua Univ, Dept Elect Engn, Tsinghua Natl Lab Informat Sci & Technol, Beijing 100084, Peoples R China
Keywords
low-resource language; cross-lingual posterior features; hierarchical architectures; ensemble system
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Recently there has been some interest in how to build LVCSR systems for low-resource languages. The scenario we focus on here has only one hour of acoustic training data in the "target" language but more plentiful data in other languages. This paper presents approaches using MLP-based features: we construct a low-resource system that uses additional information from the non-target languages to train cross-lingual MLPs. A hierarchical architecture and a multi-stream strategy are applied at the cross-lingual phone level to make the neural network more discriminative. Additionally, an elaborate ensemble system with various acoustic feature streams and context expansion lengths is proposed. After system combination with these two strategies, we obtain significant improvements of more than 8% absolute over a conventional baseline in this low-resource scenario with only one hour of target training data.
Pages: 2581-2584
Page count: 4
Related papers
50 in total
  • [1] Cross-Lingual Language Modeling for Low-Resource Speech Recognition
    Xu, Ping
    Fung, Pascale
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2013, 21 (06): : 1134 - 1144
  • [2] Exploiting Adapters for Cross-Lingual Low-Resource Speech Recognition
    Hou, Wenxin
    Zhu, Han
    Wang, Yidong
    Wang, Jindong
    Qin, Tao
    Xu, Renju
    Shinozaki, Takahiro
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2022, 30 : 317 - 329
  • [3] Cross-Lingual Subspace Gaussian Mixture Models for Low-Resource Speech Recognition
    Lu, Liang
    Ghoshal, Arnab
    Renals, Steve
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2014, 22 (01) : 17 - 27
  • [4] Cross-lingual subspace Gaussian mixture models for low-resource speech recognition
    1600, Institute of Electrical and Electronics Engineers Inc., United States (22):
  • [5] CAM: A cross-lingual adaptation framework for low-resource language speech recognition
    Hu, Qing
    Zhang, Yan
    Zhang, Xianlei
    Han, Zongyu
    Yu, Xilong
    INFORMATION FUSION, 2024, 111
  • [6] SUBSPACE MIXTURE MODEL FOR LOW-RESOURCE SPEECH RECOGNITION IN CROSS-LINGUAL SETTINGS
    Miao, Yajie
    Metze, Florian
    Waibel, Alex
    2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 7339 - 7343
  • [7] Cross-lingual subspace Gaussian mixture models for low-resource speech recognition
    1600, Institute of Electrical and Electronics Engineers Inc., United States (22):
  • [8] CROSS-LINGUAL TRANSFER LEARNING FOR LOW-RESOURCE SPEECH TRANSLATION
    Khurana, Sameer
    Dawalatabad, Nauman
    Laurent, Antoine
    Vicente, Luis
    Gimeno, Pablo
    Mingote, Victoria
    Glass, James
    2024 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING WORKSHOPS, ICASSPW 2024, 2024, : 670 - 674
  • [9] Learning Cross-lingual Mappings for Data Augmentation to Improve Low-Resource Speech Recognition
    Farooq, Muhammad Umar
    Hain, Thomas
    INTERSPEECH 2023, 2023, : 5072 - 5076
  • [10] Cross-Lingual Cross-Age Group Adaptation for Low-Resource Elderly Speech Emotion Recognition
    Cahyawijaya, Samuel
    Lovenia, Holy
    Chung, Willy
    Frieske, Rita
    Liu, Zihan
    Fung, Pascale
    INTERSPEECH 2023, 2023, : 3352 - 3356