Speaker identification based on the frame linear predictive coding spectrum technique

Cited by: 13
Authors
Wu, Jian-Da [1]
Lin, Bing-Fu [1]
Affiliations
[1] Natl Changhua Univ Educ, Grad Inst Vehicle Engn, Changhua 500, Taiwan
Keywords
Speaker identification; Linear predictive coding; Gaussian mixture model; General regression neural network; Continuous wavelet transform; Fuzzy inference; Neural networks; Recognition; System
DOI
10.1016/j.eswa.2008.10.051
CLC number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents a frame linear predictive coding spectrum (FLPCS) technique for speaker identification. Linear predictive coding (LPC) has traditionally been applied in many speech recognition applications; this study proposes a modification of LPC, termed FLPCS, for speaker identification. The analysis procedure consists of feature extraction and voice classification. In the feature extraction stage, the representative characteristics are extracted using the FLPCS technique, which reduces the size of a speaker's feature vector while keeping the recognition rate acceptable. In the classification stage, a general regression neural network (GRNN) and a Gaussian mixture model (GMM) are applied because of their rapid response and simplicity of implementation. In the experimental investigation, the performance of FLPCS coefficients of different orders, derived from the LPC spectrum, is compared, and the capabilities of GRNN and GMM are analyzed. The experimental results show that GMM achieves a better recognition rate with feature extraction using the FLPCS method, and suggest that GMM can complete training and identification in a very short time. (c) 2008 Elsevier Ltd. All rights reserved.
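The abstract does not say how the FLPCS coefficients are derived from the LPC spectrum, so the following is only a minimal sketch of the standard per-frame LPC spectral envelope that such features start from, not the authors' method. It uses the autocorrelation method with the Levinson-Durbin recursion; the function names, the Hamming window, the default order of 12 and the 64 frequency bins are all illustrative assumptions.

```python
import cmath
import math


def autocorrelation(frame, max_lag):
    """Return the autocorrelation values r[0..max_lag] of one frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations by the Levinson-Durbin recursion.

    Returns (a, err) where a[0] == 1, A(z) = sum(a[k] * z**-k), and err is
    the final prediction-error energy.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # degenerate frame (e.g. silence): stop early
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient for this order
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a, err


def lpc_spectrum(frame, order=12, n_bins=64):
    """Log-magnitude LPC envelope (dB) of one frame at n_bins points in [0, pi)."""
    n = len(frame)
    # Hamming window (an assumed choice) to reduce edge effects.
    windowed = [x * (0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    a, err = levinson_durbin(autocorrelation(windowed, order), order)
    gain = math.sqrt(max(err, 1e-12))
    envelope = []
    for b in range(n_bins):
        w = math.pi * b / n_bins
        denom = abs(sum(a[k] * cmath.exp(-1j * w * k) for k in range(order + 1)))
        envelope.append(20.0 * math.log10(gain / max(denom, 1e-12)))
    return envelope
```

Calling `lpc_spectrum(frame)` on each short frame of a recording yields one envelope per frame. A feature vector built from such envelopes could then be passed to a classifier such as a GMM or GRNN, which is the role the abstract assigns to the FLPCS features.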
Pages: 8056-8063
Page count: 8