Automatic Speech Recognition for Under-Resourced Languages: Application to Vietnamese Language

Cited by: 57
Authors
Le, Viet-Bac [1]
Besacier, Laurent [1]
Affiliations
[1] Univ Grenoble 1, LIG Lab, CNRS, UMR 5217, F-38041 Grenoble 9, France
Source
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | 2009 / Vol. 17 / No. 08
Keywords
Crosslingual acoustic modeling; grapheme-based acoustic modeling; lattice decomposition and combination; speech recognition; under-resourced languages;
DOI
10.1109/TASL.2009.2021723
Chinese Library Classification number
O42 [Acoustics];
Subject classification code
070206 ; 082403 ;
Abstract
This paper presents our work in automatic speech recognition (ASR) in the context of under-resourced languages with application to Vietnamese. Different techniques for bootstrapping acoustic models are presented. First, we present the use of acoustic-phonetic unit distances and the potential of crosslingual acoustic modeling for under-resourced languages. Experimental results on Vietnamese showed that with only a few hours of target language speech data, crosslingual context independent modeling worked better than crosslingual context dependent modeling. However, it was outperformed by the latter one, when more speech data were available. We concluded, therefore, that in both cases, crosslingual systems are better than monolingual baseline systems. The proposal of grapheme-based acoustic modeling, which avoids building a phonetic dictionary, is also investigated in our work. Finally, since the use of sub-word units (morphemes, syllables, characters, etc.) can reduce the high out-of-vocabulary rate and improve the lack of text resources in statistical language modeling for under-resourced languages, we propose several methods to decompose, normalize and combine word and sub-word lattices generated from different ASR systems. The proposed lattice combination scheme results in a relative syllable error rate reduction of 6.6% over the sentence MAP baseline method for a Vietnamese ASR task.
Pages: 1471 - 1482
Page count: 12
Related papers
50 items in total
  • [31] Analysis of Mismatched Transcriptions Generated by Humans and Machines for Under-Resourced Languages
    Do, Van Hai
    Chen, Nancy F.
    Lim, Boon Pang
    Hasegawa-Johnson, Mark
    17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 3863 - 3867
  • [32] Deploying a Speech Recognition Model for Under-Resourced Languages: A Case Study on Dioula Wake Words 1, 2, 3, and 4
    Ouedraogo, Ismaila
    Some, Borlli Michel Jonas
    Keita, Zakaria Cheick Oumar
    Nabaloum, Emile
    Bationo, Fabrice
    Benedikter, Roland
    Diallo, Gayo
    PROCEEDINGS OF 2023 7TH INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING AND INFORMATION RETRIEVAL, NLPIR 2023, 2023, : 111 - 118
  • [33] Matrix Covariance Estimation Methods for robust Security Speech Recognition with under-resourced conditions
    Barroso, N.
    De Ipina, K. Lopez
    Hernandez, C.
    Ezeiza, A.
    2011 IEEE INTERNATIONAL CARNAHAN CONFERENCE ON SECURITY TECHNOLOGY (ICCST), 2011,
  • [34] CODE-SWITCHED LANGUAGE MODELLING USING A CODE PREDICTIVE LSTM IN UNDER-RESOURCED SOUTH AFRICAN LANGUAGES
    van Vuren, Joshua Jansen
    Niesler, Thomas
    2022 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP, SLT, 2022, : 785 - 791
  • [35] A Phone Mapping Technique for Acoustic Modeling of Under-resourced Languages
    Van Hai Do
    Xiao, Xiong
    Chng, Eng Siong
    Li, Haizhou
    2012 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING (IALP 2012), 2012, : 233 - 236
  • [36] Multi-task learning in under-resourced Dravidian languages
    Adeep Hande
    Siddhanth U. Hegde
    Bharathi Raja Chakravarthi
    Journal of Data, Information and Management, 2022, 4 (2): : 137 - 165
  • [37] Network-Enabled Keyword Extraction for Under-Resourced Languages
    Beliga, Slobodan
    Martincic-Ipsic, Sanda
    SEMANTIC KEYWORD-BASED SEARCH ON STRUCTURED DATA SOURCES, IKC 2016, 2017, 10151 : 124 - 135
  • [38] A Statistical Method for Translating Chinese into Under-resourced Minority Languages
    Chen, Lei
    Li, Miao
    Zhang, Jian
    Zhu, Zede
    Yang, Zhenxin
    MACHINE TRANSLATION, CWMT 2014, 2014, 493 : 49 - 60
  • [39] Towards Learning Morphology for Under-Resourced Fusional and Agglutinating Languages
    Shalonova, Ksenia
    Golenia, Bruno
    Flach, Peter
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2009, 17 (05): : 956 - 965
  • [40] Cross-lingual acoustic modeling for under-resourced languages
    Song, Meixu
    Zhang, Qingqing
    Pan, Jielin
    Yan, Yonghong
    Journal of Computational Information Systems, 2015, 11 (14): : 5039 - 5046