Speaker identification using hybrid Karhunen-Loeve transform and Gaussian mixture model approach

Cited by: 3
Authors
Chen, CCT [1]
Chen, CT [1]
Hou, CK [1]
Affiliations
[1] Natl Sun Yat Sen Univ, Dept Elect Engn, Kaohsiung 804, Taiwan
Keywords
Karhunen-Loeve transform; Bhattacharyya distance; Gaussian mixture models; speaker identification; Mel frequency cepstral coefficients
DOI
10.1016/j.patcog.2003.08.013
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper proposes a classification scheme that combines the Karhunen-Loeve transform (KLT) with a Gaussian mixture model (GMM) for text-independent speaker identification. The results show that the combination improves classification accuracy and reduces computational cost. On a database of 500 Mandarin speakers, the scheme achieves an accuracy improvement of up to 4% and a tenfold reduction in computational cost relative to the conventional GMM. © 2003 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved.
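The abstract does not spell out how the KLT and the GMM are combined, so the following is only a minimal sketch of one plausible pipeline, not the authors' method. It assumes a single KLT basis estimated from pooled training frames, one diagonal-covariance GMM per speaker trained by plain EM, and identification by highest average log-likelihood. All function names, the number of retained components `k`, and the number of mixtures `m` are hypothetical choices made for illustration. Feature frames are assumed to be rows of a NumPy array (for example MFCC vectors).

```python
import numpy as np

def fit_klt(X, k):
    """Estimate a KLT: mean vector and top-k eigenvectors of the covariance."""
    mean = X.mean(axis=0)
    cov = np.cov(X - mean, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]       # keep the k largest
    return mean, eigvecs[:, order]

def apply_klt(X, mean, basis):
    """Project frames onto the KLT basis (decorrelated coordinates)."""
    return (X - mean) @ basis

def logsumexp(a, axis):
    peak = a.max(axis=axis, keepdims=True)
    return (peak + np.log(np.exp(a - peak).sum(axis=axis, keepdims=True))).squeeze(axis)

def log_gauss_diag(X, means, variances):
    """Log density of every frame under every diagonal Gaussian, shape (n, m)."""
    diff = X[:, None, :] - means[None, :, :]
    return -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                   + np.sum(np.log(2 * np.pi * variances), axis=1))

def fit_gmm_diag(X, m, iters=50, seed=0):
    """Fit an m-component diagonal-covariance GMM with basic EM."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    means = X[rng.choice(n, m, replace=False)]  # initialise means from random frames
    variances = np.tile(X.var(axis=0) + 1e-6, (m, 1))
    weights = np.full(m, 1.0 / m)
    for _ in range(iters):
        # E-step: responsibilities of each component for each frame
        log_p = log_gauss_diag(X, means, variances) + np.log(weights)
        resp = np.exp(log_p - logsumexp(log_p, axis=1)[:, None])
        # M-step: re-estimate weights, means, and variances
        nk = resp.sum(axis=0) + 1e-10
        weights = nk / n
        means = (resp.T @ X) / nk[:, None]
        variances = np.maximum((resp.T @ X ** 2) / nk[:, None] - means ** 2, 1e-6)
    return weights, means, variances

def avg_log_likelihood(X, gmm):
    weights, means, variances = gmm
    return logsumexp(log_gauss_diag(X, means, variances) + np.log(weights), axis=1).mean()

def train(speaker_frames, k=6, m=2):
    """speaker_frames: dict mapping speaker id -> (n_frames, n_features) array."""
    pooled = np.vstack(list(speaker_frames.values()))
    mean, basis = fit_klt(pooled, k)            # assumption: one shared KLT
    models = {s: fit_gmm_diag(apply_klt(F, mean, basis), m)
              for s, F in speaker_frames.items()}
    return mean, basis, models

def identify(frames, mean, basis, models):
    """Return the speaker whose GMM gives the highest average log-likelihood."""
    Z = apply_klt(frames, mean, basis)
    return max(models, key=lambda s: avg_log_likelihood(Z, models[s]))
```

In a sketch like this, projecting onto fewer KLT components shrinks the dimension every Gaussian has to evaluate, and decorrelated features are a better fit for diagonal covariances. Whether that is the source of the savings reported in the abstract is not stated in this record.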
Pages: 1073-1075
Number of pages: 3