Building Automatic Speech Recognition Systems for Moroccan Dialect: A Phoneme-Based Approach

Cited by: 0

Authors:

Abderrahim Ezzine [1]

Naouar Laaidi [2]

Ouissam Zealouk [1]

Hassan Satori [2]
Affiliations:

[1] Department of Computer Science and Mathematics, Faculty of Sciences Dhar Mahraz, Sidi Mohamed Ben Abdellah University, Fez

[2] Laboratory of Computer Science, Signals, Automation and Cognition (LISAC), Fez

Source:

SN Computer Science, Vol. 5, Issue 6

Keywords:

HMM-GMM; In-house corpus; Machine learning; Moroccan dialect; Phoneme modeling; Speech recognition

DOI:

10.1007/s42979-024-03108-5

Abstract:

Building efficient acoustic models for dialects is a major challenge in Automatic Speech Recognition (ASR). In this paper, we investigate a speech recognition system for the Moroccan Fessi dialect based on phoneme modeling. We employ a combined approach using the Hidden Markov Model (HMM) and the Gaussian Mixture Model (GMM). We also analyse the dialect-specific aspects of the ASR system, including the nature of the phonemes and the phonetic inventory. The best performance was obtained with the 3 HMM and 4 GMM configuration, which achieved an accuracy of 97.33%. Additionally, digits containing voiced pharyngeal phonemes, particularly the phoneme /ʕ/, achieved the highest recognition rate, while words containing the phoneme /s/ exhibited multiple substitutions. © The Author(s), under exclusive licence to Springer Nature Singapore Pte Ltd. 2024.
