Speech corpus for Medina dialect

Cited by: 0

Authors:

Khalafallah, Haneen Bahjat [1]

Fattah, Mohamed Abdel [2]

Abdulrahman, Ruqayya [1]

Affiliations:

[1] Taibah Univ, Coll Comp Sci & Engn, Medina, Saudi Arabia

[2] Helwan Univ, Dept Elect Technol, FTE, Cairo, Egypt

Source:

JOURNAL OF KING SAUD UNIVERSITY-COMPUTER AND INFORMATION SCIENCES | 2024 / Volume 36 / Issue 02

Keywords:

Machine Learning (ML); Natural Language Processing (NLP); Automatic Speech Recognition; Arabic ASR Speech Corpus; Arabic Dialects; CMU Sphinx; SPEAKER IDENTIFICATION; RECOGNITION SYSTEM;

DOI:

10.1016/j.jksuci.2023.101864

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812 ;

Abstract:

Automatic Speech Recognition (ASR) has standard rules that must be followed and considered carefully. Factors that degrade ASR performance include variations in pronunciation and the misrecognition of short words. Arabic ASR faces additional challenges, such as the difficulty of obtaining corpora for spoken dialects. Obtaining a wide range of diacritized text, together with the enormous number of word forms, is considered a major challenge because of the morphological richness of Arabic and the fact that its letters can be written without diacritics. Although Arabic is one of the most popular languages, Arabic ASR systems are still rare compared with those for other languages. Because ASR systems depend primarily on speech corpora, Arabic ASR systems require dialect-specific speech corpora. Such corpora are still scarce, costly, and sometimes nonexistent. In this research, we help to overcome the lack of speech recognition, and the resulting misunderstanding, for one of the best-known dialects in Saudi Arabia, the Medina dialect. We created a brand-new corpus, the "Haneen Corpus", which consists of 70,364 tokens uttered in the Medina dialect, and constructed a dictionary using 64 phonemes to analyse the correct pronunciation of words. Our Medina Dialect ASR System uses Hidden Markov Models (HMM) and achieved 92.09% speech recognition accuracy.
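The abstract reports a recognition accuracy figure but does not say how it was computed. As a rough illustration only, the sketch below assumes one common convention, word accuracy defined as 1 minus the word error rate (WER), where WER is the word-level edit distance between a reference transcript and the recognizer output divided by the number of reference words. The function name `word_accuracy` and the sample strings are hypothetical and do not come from the paper.

```python
def word_accuracy(reference: str, hypothesis: str) -> float:
    """Return 1 - WER, with WER from word-level Levenshtein distance.

    Assumption: this is one common ASR metric, not necessarily the
    one used to obtain the figure reported in the abstract.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far (previous row) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # distance from i reference words to an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr

    return 1.0 - prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical transcripts: one substituted word out of four.
    print(word_accuracy("a b c d", "a x c d"))
```

Under this definition the value can drop below zero when the hypothesis contains many inserted words, which is one reason some evaluations report percent-correct (ignoring insertions) instead.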

Pages: 18
