Automatic Speech Transcription for Low-Resource Languages - The Case of Yoloxochitl Mixtec (Mexico)

Cited by: 4
Authors
Mitra, Vikramjit [1 ]
Kathol, Andreas [1 ]
Amith, Jonathan D. [2 ]
Castillo Garcia, Rey [3 ]
Affiliations
[1] SRI Int, Speech Technol & Res Lab, 333 Ravenswood Ave, Menlo Pk, CA 94025 USA
[2] Gettysburg Coll, Gettysburg, PA 17325 USA
[3] Secretaria Educ Publ, Chilpancingo De Los Bravo, State Of Guerrero, Mexico
Source
17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES | 2016
Funding
National Science Foundation (USA);
Keywords
automatic speech recognition; endangered languages; large vocabulary continuous speech recognition; articulatory features; tonal features; acoustic-phonetic features; convolutional neural networks; RECOGNITION; FEATURES;
DOI
10.21437/Interspeech.2016-546
Chinese Library Classification
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
The rate at which endangered languages can be documented has been highly constrained by human factors. Although digital recording of natural speech in endangered languages may proceed at a fairly robust pace, transcription of this material is not only time consuming but severely limited by the lack of native-speaker personnel proficient in the orthography of their mother tongue. Our NSF-funded project in the Documenting Endangered Languages (DEL) program proposes to tackle this problem from two sides: first via a tool that helps native speakers become proficient in the orthographic conventions of their language, and second by using automatic speech recognition (ASR) output that assists in the transcription effort for newly recorded audio data. In the present study, we focus exclusively on progress in developing speech recognition for the language of interest, Yoloxochitl Mixtec (YM), an Oto-Manguean language spoken by fewer than 5000 speakers on the Pacific coast of Guerrero, Mexico. In particular, we present results from an initial set of experiments and discuss future directions through which better and more robust acoustic models for endangered languages with limited resources can be created.
Pages: 3076 - 3080
Page count: 5
Related papers
50 in total
  • [41] Transfer Learning, Style Control, and Speaker Reconstruction Loss for Zero-Shot Multilingual Multi-Speaker Text-to-Speech on Low-Resource Languages
    Azizah, Kurniawati
    Jatmiko, Wisnu
    IEEE ACCESS, 2022, 10 : 5895 - 5911
  • [42] Effect of TTS Generated Audio on OOV Detection and Word Error Rate in ASR for Low-resource Languages
    Murthy, Savitha
    Sitaram, Dinkar
    Sitaram, Sunayana
    19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 1026 - 1030
  • [43] No data to crawl? Monolingual corpus creation from PDF files of truly low-resource languages in Peru
    Bustamante, Gina
    Oncevay, Arturo
    Zariquiey, Roberto
    PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2020), 2020, : 2914 - 2923
  • [44] Learning Cross-lingual Mappings for Data Augmentation to Improve Low-Resource Speech Recognition
    Farooq, Muhammad Umar
    Hain, Thomas
    INTERSPEECH 2023, 2023, : 5072 - 5076
  • [45] Leveraging Simultaneous Translation for Enhancing Transcription of Low-resource Language via Cross Attention Mechanism
    Soky, Kak
    Li, Sheng
    Mimura, Masato
    Chu, Chenhui
    Kawahara, Tatsuya
    INTERSPEECH 2022, 2022, : 1362 - 1366
  • [46] A mechanism for personalized Automatic Speech Recognition for less frequently spoken languages: the Greek case
    Panagiotis Antoniadis
    Emmanouil Tsardoulias
    Andreas Symeonidis
    Multimedia Tools and Applications, 2022, 81 : 40635 - 40652
  • [47] A mechanism for personalized Automatic Speech Recognition for less frequently spoken languages: the Greek case
    Antoniadis, Panagiotis
    Tsardoulias, Emmanouil
    Symeonidis, Andreas
    MULTIMEDIA TOOLS AND APPLICATIONS, 2022, 81 (28) : 40635 - 40652
  • [48] Overcoming Data Sparsity in Acoustic Modeling of Low-Resource Language by Borrowing Data and Model Parameters from High-Resource Languages
    Abraham, Basil
    Umesh, S.
    Joy, Neethu Mariam
    17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 3037 - 3041
  • [49] Data Augmentation, Feature Combination, and Multilingual Neural Networks to Improve ASR and KWS Performance for Low-resource Languages
    Tueske, Zoltan
    Golik, Pavel
    Nolden, David
    Schlueter, Ralf
    Ney, Hermann
    15TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2014), VOLS 1-4, 2014, : 1420 - 1424
  • [50] The Effects of Input Type and Pronunciation Dictionary Usage in Transfer Learning for Low-Resource Text-to-Speech
    Do, Phat
    Coler, Matt
    Dijkstra, Jelske
    Klabbers, Esther
    INTERSPEECH 2023, 2023, : 5461 - 5465