Automatic Speech Recognition for Under-Resourced Languages: Application to Vietnamese Language

Cited by: 57

Authors:

Le, Viet-Bac [1]

Besacier, Laurent [1]

Affiliations:

[1] Univ Grenoble 1, LIG Lab, CNRS, UMR 5217, F-38041 Grenoble 9, France

Source:

IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | 2009 / Vol. 17 / No. 08

Keywords:

Crosslingual acoustic modeling; grapheme-based acoustic modeling; lattice decomposition and combination; speech recognition; under-resourced languages;

DOI:

10.1109/TASL.2009.2021723

Chinese Library Classification number:

O42 [Acoustics];

Subject classification codes:

070206 ; 082403 ;

Abstract:

This paper presents our work in automatic speech recognition (ASR) in the context of under-resourced languages with application to Vietnamese. Different techniques for bootstrapping acoustic models are presented. First, we present the use of acoustic-phonetic unit distances and the potential of crosslingual acoustic modeling for under-resourced languages. Experimental results on Vietnamese showed that with only a few hours of target language speech data, crosslingual context independent modeling worked better than crosslingual context dependent modeling. However, it was outperformed by the latter one, when more speech data were available. We concluded, therefore, that in both cases, crosslingual systems are better than monolingual baseline systems. The proposal of grapheme-based acoustic modeling, which avoids building a phonetic dictionary, is also investigated in our work. Finally, since the use of sub-word units (morphemes, syllables, characters, etc.) can reduce the high out-of-vocabulary rate and improve the lack of text resources in statistical language modeling for under-resourced languages, we propose several methods to decompose, normalize and combine word and sub-word lattices generated from different ASR systems. The proposed lattice combination scheme results in a relative syllable error rate reduction of 6.6% over the sentence MAP baseline method for a Vietnamese ASR task.

Pages: 1471-1482

Page count: 12

