Automatic Speech Recognition for Under-Resourced Languages: Application to Vietnamese Language

Cited by: 57
Authors
Le, Viet-Bac [1]
Besacier, Laurent [1]
Affiliation
[1] Univ Grenoble 1, LIG Lab, CNRS, UMR 5217, F-38041 Grenoble 9, France
Source
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | 2009 / Vol. 17 / Issue 08
Keywords
Crosslingual acoustic modeling; grapheme-based acoustic modeling; lattice decomposition and combination; speech recognition; under-resourced languages;
DOI
10.1109/TASL.2009.2021723
Chinese Library Classification
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
This paper presents our work in automatic speech recognition (ASR) in the context of under-resourced languages with application to Vietnamese. Different techniques for bootstrapping acoustic models are presented. First, we present the use of acoustic-phonetic unit distances and the potential of crosslingual acoustic modeling for under-resourced languages. Experimental results on Vietnamese showed that with only a few hours of target language speech data, crosslingual context independent modeling worked better than crosslingual context dependent modeling. However, it was outperformed by the latter one, when more speech data were available. We concluded, therefore, that in both cases, crosslingual systems are better than monolingual baseline systems. The proposal of grapheme-based acoustic modeling, which avoids building a phonetic dictionary, is also investigated in our work. Finally, since the use of sub-word units (morphemes, syllables, characters, etc.) can reduce the high out-of-vocabulary rate and improve the lack of text resources in statistical language modeling for under-resourced languages, we propose several methods to decompose, normalize and combine word and sub-word lattices generated from different ASR systems. The proposed lattice combination scheme results in a relative syllable error rate reduction of 6.6% over the sentence MAP baseline method for a Vietnamese ASR task.
Pages: 1471 / 1482
Number of pages: 12
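
The abstract reports its headline result as a relative syllable error rate reduction. As a minimal sketch of how such a figure is typically computed (not the authors' evaluation code), the Python below scores a hypothesis against a reference at the syllable level using edit distance, then compares two error rates. The function names and all inputs are hypothetical, and the sketch assumes syllables are separated by spaces, as in standard Vietnamese orthography.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(
                prev[j] + 1,             # deletion
                cur[j - 1] + 1,          # insertion
                prev[j - 1] + (r != h),  # substitution or match
            ))
        prev = cur
    return prev[-1]


def syllable_error_rate(reference, hypothesis):
    """Edit distance over space-separated syllables, divided by reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    return edit_distance(ref, hyp) / len(ref)


def relative_reduction(baseline_rate, improved_rate):
    """Relative error rate reduction of an improved system over a baseline."""
    return (baseline_rate - improved_rate) / baseline_rate


# Hypothetical values for illustration only; they are not taken from the paper.
ser = syllable_error_rate("toi di hoc", "toi di")
gain = relative_reduction(30.0, 28.02)
```

A relative reduction is measured against the baseline error rate, so it is not the same quantity as the absolute difference between the two rates.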