Leveraging translations for speech transcription in low-resource settings

Authors
Anastasopoulos, Antonios [1]
Chiang, David [1]
Affiliations
[1] Univ Notre Dame, Dept Comp Sci & Engn, Notre Dame, IN 46556 USA
Source
19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES | 2018
Funding
National Science Foundation (USA)
Keywords
neural multi-source models; speech transcription; endangered languages
DOI
Not available
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Recently proposed data collection frameworks for endangered language documentation aim not only to collect speech in the language of interest, but also to collect translations into a high-resource language that will render the collected resource interpretable. We focus on this scenario and explore whether we can improve transcription quality under these extremely low-resource settings with the assistance of text translations. We present a neural multi-source model and evaluate several variations of it on three low-resource datasets. We find that our multi-source model with shared attention outperforms the baselines, reducing transcription character error rate by up to 12.3%.
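The character error rate reported in the abstract is conventionally defined as the character-level edit (Levenshtein) distance between a reference transcription and the system output, divided by the reference length. The sketch below illustrates that conventional definition only; this record does not state how the authors computed the metric, and the function names `edit_distance` and `character_error_rate` are hypothetical.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in characters."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]  # turning ref[:i] into an empty string takes i deletions
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete r from the reference
                curr[j - 1] + 1,     # insert h into the reference
                prev[j - 1] + cost,  # substitute r with h, or match
            ))
        prev = curr
    return prev[-1]


def character_error_rate(ref: str, hyp: str) -> float:
    """Edit distance normalized by reference length (assumed definition)."""
    if not ref:
        raise ValueError("reference transcription must be non-empty")
    return edit_distance(ref, hyp) / len(ref)
```

For example, `character_error_rate("kitten", "sitting")` evaluates to 0.5, since three edits are needed against a six-character reference.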
Pages: 1279-1283
Page count: 5