Exploiting Adapters for Cross-Lingual Low-Resource Speech Recognition

Cited by: 33
Authors
Hou, Wenxin [1 ,2 ]
Zhu, Han [3 ]
Wang, Yidong [1 ]
Wang, Jindong [4 ]
Qin, Tao [4 ]
Xu, Renju [5 ]
Shinozaki, Takahiro [1 ]
Affiliations
[1] Tokyo Inst Technol, Tokyo 1528550, Japan
[2] Microsoft, Suzhou 215123, Peoples R China
[3] Chinese Acad Sci, Inst Acoust, Beijing 100045, Peoples R China
[4] Microsoft Res Asia, Beijing 100080, Peoples R China
[5] Zhejiang Univ, Ctr Data Sci, Hangzhou 310027, Peoples R China
Keywords
Adaptation models; Task analysis; Speech recognition; Transformers; Training; Training data; Data models; cross-lingual adaptation; meta-learning; parameter-efficiency
DOI
10.1109/TASLP.2021.3138674
Chinese Library Classification
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
Cross-lingual speech adaptation aims to leverage multiple rich-resource languages to build models for a low-resource target language. Because the low-resource language has limited training data, speech recognition models can easily overfit. The adapter is a versatile module that can be plugged into a Transformer for parameter-efficient learning. In this paper, we propose to use adapters for parameter-efficient cross-lingual speech adaptation. Building on our previous MetaAdapter, which leverages adapters implicitly, we propose a novel algorithm called SimAdapter that explicitly learns knowledge from adapters. Both algorithms can be easily integrated into the Transformer structure. MetaAdapter uses meta-learning to transfer general knowledge from the training data to the test language, while SimAdapter learns the similarities between the source and target languages during fine-tuning using the adapters. We conduct extensive experiments on five low-resource languages in the Common Voice dataset. Results demonstrate that MetaAdapter and SimAdapter can reduce WER by 2.98% and 2.55% with only 2.5% and 15.5% of the trainable parameters of the strong full-model fine-tuning baseline. Moreover, the two novel algorithms can be integrated for better performance, with up to a 3.55% relative WER reduction.
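The abstract describes adapters as small bottleneck modules plugged into a frozen Transformer, and SimAdapter as an attention-style fusion over per-language adapters. A minimal pure-Python sketch of both ideas follows; it is illustrative only — the `Adapter` and `sim_fuse` names, the plain-list linear algebra, and the externally supplied `scores` are our simplifications, not the paper's actual implementation.

```python
import math

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def relu(v):
    return [max(0.0, a) for a in v]

class Adapter:
    """Bottleneck adapter: down-project, nonlinearity, up-project,
    then add a residual connection back to the input hidden state."""
    def __init__(self, W_down, W_up):
        self.W_down = W_down  # shape (bottleneck, hidden): down-projection
        self.W_up = W_up      # shape (hidden, bottleneck): up-projection

    def __call__(self, h):
        z = relu(matvec(self.W_down, h))          # compress the hidden state
        u = matvec(self.W_up, z)                  # expand back to hidden size
        return [hi + ui for hi, ui in zip(h, u)]  # residual add

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def sim_fuse(h, adapter_outputs, scores):
    """Schematic SimAdapter-style fusion: attention-weight the outputs of
    several source-language adapters. `scores` stands in for the learned
    similarities between the hidden state h and each adapter's output."""
    w = softmax(scores)
    return [sum(wi * out[d] for wi, out in zip(w, adapter_outputs))
            for d in range(len(h))]
```

In the paper's setting, such modules sit inside each Transformer layer, and only the adapter (and, for SimAdapter, fusion) weights are trained on the target language, which is what keeps the trainable-parameter budget so small.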
Pages: 317-329 (13 pages)