Arabic phonemes recognition using hybrid LVQ/HMM model for continuous speech recognition

Cited by: 11
Authors
Nahar, Khalid M. O. [1 ]
Abu Shquier, Mohammed [2 ]
Al-Khatib, Wasfi G. [3 ]
Al-Muhtaseb, Husni [3 ]
Elshafei, Moustafa [4 ]
Affiliations
[1] Yarmouk Univ, Fac Comp Sci & Informat Technol, Dept Comp Sci, Irbid 21163, Jordan
[2] Jarash Univ, Fac Comp Sci & Informat Technol, Dept Comp Sci, Jarash, Jordan
[3] King Fahd Univ Petr & Minerals, Informat & Comp Sci Dept, Dhahran 31261, Saudi Arabia
[4] King Fahd Univ Petr & Minerals, Dept Syst Engn, Dhahran 31261, Saudi Arabia
Keywords
Learning vector quantization (LVQ); Codebooks; K-means algorithm; Phonemes transcription; Hidden Markov model (HMM); Hybrid LVQ/HMM model;
DOI
10.1007/s10772-016-9337-5
Chinese Library Classification
TM [Electrical Engineering Technology]; TN [Electronic Technology & Communication Technology];
Subject Classification
0808; 0809;
Abstract
In an attempt to increase the rate of Arabic phoneme recognition, we introduce a novel hybrid recognition algorithm composed of learning vector quantization (LVQ) and a hidden Markov model (HMM). The hybrid algorithm is used to recognize Arabic phonemes in continuous open-vocabulary speech. A recorded Arabic corpus of various TV news broadcasts in Modern Standard Arabic was used for training and testing. We employ a data-driven approach to generate training feature vectors that embed frame-neighboring correlation information. Next, we generate the phoneme codebooks using the K-means splitting algorithm. Then, we train the generated codebooks using the LVQ algorithm. We achieved a performance of 98.49% during independent classification training and 90% during dependent classification training. When the trained LVQ codebooks were used for Arabic utterance transcription, the phoneme recognition rate was 72% using LVQ alone. We then combined the LVQ codebooks with a single-state HMM using an enhanced Viterbi algorithm that incorporates phoneme bigrams, achieving an 89% Arabic phoneme recognition rate with the hybrid LVQ/HMM algorithm.
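The abstract outlines the full pipeline: K-means-initialized phoneme codebooks, LVQ training of the codewords, and a single-state-per-phoneme Viterbi decoder whose transitions come from phoneme bigrams. The sketch below (Python/NumPy) illustrates how such a hybrid decoder could be wired together; the function names, the distance-based local score, the `beta` bigram weight, and the LVQ1 learning-rate schedule are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def train_lvq1(frames, labels, codebook, codebook_labels, lr=0.05, epochs=20):
    """LVQ1 update: pull the nearest codeword toward a frame when labels agree,
    push it away when they disagree. `frames` is (N, D), `codebook` is (K, D)."""
    cb = codebook.copy()
    for epoch in range(epochs):
        rate = lr * (1.0 - epoch / epochs)                     # linearly decaying step size
        for x, y in zip(frames, labels):
            j = int(np.argmin(np.linalg.norm(cb - x, axis=1))) # nearest codeword
            sign = 1.0 if codebook_labels[j] == y else -1.0
            cb[j] += sign * rate * (x - cb[j])
    return cb

def viterbi_lvq_bigram(frames, codebook, codebook_labels, phonemes, bigram, beta=1.0):
    """One state per phoneme: the local score of frame t for phoneme p is the
    negative distance to p's nearest codeword; transitions use phoneme bigram
    log-probabilities weighted by `beta`. Returns the best phoneme sequence."""
    codebook_labels = np.asarray(codebook_labels)
    T, P = len(frames), len(phonemes)
    local = np.empty((T, P))
    for t, x in enumerate(frames):
        d = np.linalg.norm(codebook - x, axis=1)
        for p, ph in enumerate(phonemes):
            local[t, p] = -d[codebook_labels == ph].min()      # best codeword of phoneme ph
    delta = np.empty((T, P))
    back = np.zeros((T, P), dtype=int)
    delta[0] = local[0]
    for t in range(1, T):
        # scores[i, j]: best path ending in phoneme i at t-1, then moving to phoneme j
        scores = delta[t - 1][:, None] + beta * np.log(bigram + 1e-12)
        back[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + local[t]
    path = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return [phonemes[p] for p in reversed(path)]
```

In such a setup the codebook would presumably be initialized by a per-phoneme splitting K-means pass before the LVQ fine-tuning, mirroring the codebook-generation step mentioned in the abstract; the bigram matrix would be estimated from the phoneme transcriptions of the training corpus.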
Pages: 495-508
Number of pages: 14