Speaker identification using hybrid Karhunen-Loeve transform and Gaussian mixture model approach

Cited by: 3

Authors:

Chen, CCT [1]

Chen, CT [1]

Hou, CK [1]

Affiliations:

[1] Natl Sun Yat Sen Univ, Dept Elect Engn, Kaohsiung 804, Taiwan

Source:

PATTERN RECOGNITION | 2004 / Vol. 37 / No. 5

Keywords:

Karhunen-Loeve transform; Bhattacharyya distance; Gaussian mixture models; speaker identification; Mel frequency cepstral coefficients;

DOI:

10.1016/j.patcog.2003.08.013

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper proposes a classification scheme that combines the Karhunen-Loeve transform (KLT) with the Gaussian mixture model (GMM) for text-independent speaker identification. Our results show that the combination improves classification accuracy and reduces computational cost. On a database of 500 Mandarin speakers, the scheme achieves an accuracy improvement of up to 4% and a tenfold reduction in computational cost compared with the conventional GMM. © 2003 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved.

Pages: 1073-1075

Page count: 3