Speaker identification based on the frame linear predictive coding spectrum technique

Cited by: 13

Authors:

Wu, Jian-Da [1]

Lin, Bing-Fu [1]

Affiliations:

[1] Natl Changhua Univ Educ, Grad Inst Vehicle Engn, Changhua 500, Taiwan

Source:

EXPERT SYSTEMS WITH APPLICATIONS | 2009 / Vol. 36 / Issue 04

Keywords:

Speaker identification; Linear predictive coding; Gaussian mixture model; General regression neural network; CONTINUOUS WAVELET TRANSFORM; FUZZY INFERENCE; NEURAL-NETWORKS; RECOGNITION; SYSTEM;

DOI:

10.1016/j.eswa.2008.10.051

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper presents a frame linear predictive coding spectrum (FLPCS) technique for speaker identification. Linear predictive coding (LPC) has traditionally been applied in many speech recognition applications; this study proposes a modification of LPC, termed FLPCS, for speaker identification. The analysis procedure consists of feature extraction and voice classification. In the feature extraction stage, the representative characteristics are extracted using the FLPCS technique. With this approach, the size of a speaker's feature vector can be reduced while maintaining an acceptable recognition rate. In the classification stage, a general regression neural network (GRNN) and a Gaussian mixture model (GMM) are applied because of their rapid response and simplicity of implementation. In the experimental investigation, the performance of FLPCS coefficients of different orders, derived from the LPC spectrum, is compared. A capability analysis of the GRNN and the GMM is also described. The experimental results show that the GMM achieves a better recognition rate with features extracted using the FLPCS method. The results also suggest that the GMM can complete training and identification in a very short time. (c) 2008 Elsevier Ltd. All rights reserved.
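
The abstract does not define FLPCS in detail, so the following is only a minimal sketch of the standard building block it refers to: computing an LPC spectrum for each frame of a speech signal. It uses the textbook autocorrelation method with the Levinson-Durbin recursion and is not the authors' implementation. The function names, frame length, hop size, LPC order, and FFT size are all illustrative assumptions.

```python
import numpy as np


def lpc_coefficients(frame, order):
    """Autocorrelation-method LPC via the Levinson-Durbin recursion.

    Returns (a, err), where a[0] == 1 and the prediction-error filter is
    A(z) = 1 + a[1] z^-1 + ... + a[order] z^-order.
    """
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Autocorrelation at lags 0..order.
    r = np.array([np.dot(frame[: n - k], frame[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0:
            break  # silent or degenerate frame: stop the recursion early
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err  # reflection coefficient for step i
        a_prev = a.copy()
        a[1:i] = a_prev[1:i] + k * a_prev[i - 1:0:-1]
        a[i] = k
        err *= 1.0 - k * k
    return a, err


def lpc_spectrum_db(frame, order, nfft=256):
    """LPC spectral envelope in dB: 10*log10(err / |A(e^{jw})|^2)."""
    a, err = lpc_coefficients(frame, order)
    A = np.fft.rfft(a, nfft)  # zero-pads the coefficients to nfft points
    power = max(err, 1e-12) / np.maximum(np.abs(A) ** 2, 1e-12)
    return 10.0 * np.log10(power)


def frame_lpc_spectra(signal, frame_len=256, hop=128, order=12, nfft=256):
    """Split a signal into Hamming-windowed frames; one LPC spectrum per frame.

    frame_len, hop, order, and nfft are assumed values, not taken from the paper.
    """
    signal = np.asarray(signal, dtype=float)
    window = np.hamming(frame_len)
    starts = range(0, len(signal) - frame_len + 1, hop)
    return np.array(
        [lpc_spectrum_db(signal[s:s + frame_len] * window, order, nfft) for s in starts]
    )
```

Calling `frame_lpc_spectra(signal)` on a one-dimensional array of samples returns one row per frame with `nfft // 2 + 1` spectral values. How the paper reduces these per-frame spectra to its final FLPCS feature vector is not stated in the abstract, so that step is left out here.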

References

Pages: 8056-8063

Page count: 8

20 references in total

[1] A speaker identification system using a model of artificial neural networks for an elevator application
Adami, AG
Barone, DAC
[J]. INFORMATION SCIENCES, 2001, 138 (1-4) : 1 - 5
[2] [Anonymous], P IEEE NORD SIGN PRO
[3] The history of linear prediction
Atal, BS
[J]. IEEE SIGNAL PROCESSING MAGAZINE, 2006, 23 (02) : 154 - +
[4] SPEECH ANALYSIS AND SYNTHESIS BY LINEAR PREDICTION OF SPEECH WAVE
ATAL, BS
HANAUER, SL
[J]. JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1971, 50 (02) : 637 - +
[5] Speech recognition using a wavelet packet adaptive network based fuzzy inference system
Avci, Engin
Akpolat, Zuhtu Hakan
[J]. EXPERT SYSTEMS WITH APPLICATIONS, 2006, 31 (03) : 495 - 503
[6] Childers D.G., 2000, Speech processing and synthesis toolboxes
[7] Speaker identification through use of features selected using genetic algorithm
Haydar, A
Demirekler, M
Yurtseven, MK
[J]. ELECTRONICS LETTERS, 1998, 34 (01) : 39 - 40
[8] Bearing fault diagnosis based on wavelet transform and fuzzy inference
Lou, XS
Loparo, KA
[J]. MECHANICAL SYSTEMS AND SIGNAL PROCESSING, 2004, 18 (05) : 1077 - 1095
[9] Wavelet feature selection based neural networks with application to the text independent speaker identification
Lung, SY
[J]. PATTERN RECOGNITION, 2006, 39 (08) : 1518 - 1521
[10] Text-independent speaker recognition using non-linear frame likelihood transformation
Markov, KP
Nakagawa, S
[J]. SPEECH COMMUNICATION, 1998, 24 (03) : 193 - 209
