Verification effectiveness in open-set speaker identification

Cited by: 18
Authors
Ariyaeeinia, A. M. [1]
Fortuna, J.
Sivakumaran, P.
Malegaonkar, A.
Affiliations
[1] Univ Hertfordshire, Sch Elect Commun & Elect Engn, Hatfield AL10 9AB, Herts, England
[2] Canon Res Ctr Europe Ltd, Bracknell RG12 2XH, Berks, England
Source
IEE PROCEEDINGS-VISION IMAGE AND SIGNAL PROCESSING | 2006 / Vol. 153 / No. 05
DOI
10.1049/ip-vis:20050273
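Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification codes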
0808 ; 0809 ;
Abstract
Verification effectiveness in open-set, text-independent speaker identification is the authors' primary subject of concern. The study includes an analysis of the characteristics of this mode of speaker recognition and the potential causes of errors. The use of well-known score normalisation techniques for enhancing the reliability of the process is described, and their relative effectiveness is experimentally investigated. The experiments are based on the dataset proposed for the 1-speaker detection task of the NIST Speaker Recognition Evaluation 2003. On the basis of the experimental results, it is demonstrated that significant benefits are achieved by using score normalisation in open-set identification, and that the size of these benefits depends strongly on the type of approach adopted. The results also show that better performance can be achieved by using the cohort normalisation methods. In particular, the unconstrained cohort method with a relatively small cohort size appears to outperform all other approaches.
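The decision process summarised in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each registered speaker model yields a log-likelihood score for the test utterance, that the unconstrained cohort consists of the highest-scoring competing models selected at test time, and that normalisation subtracts the mean cohort score from the best score. The function name `open_set_identify` and the parameters `cohort_size` and `threshold` are hypothetical, as are the score values in the usage note.

```python
import numpy as np


def open_set_identify(scores, cohort_size=3, threshold=0.0):
    """Open-set identification with a cohort-normalised verification step.

    scores: log-likelihood scores of one test utterance against every
            registered speaker model (assumed input, one score per model).
    Returns (speaker_index or None, normalised_score).
    """
    scores = np.asarray(scores, dtype=float)
    if cohort_size < 1 or cohort_size > scores.size - 1:
        raise ValueError("cohort_size must be between 1 and len(scores) - 1")

    # Stage 1: closed-set identification, pick the best-matching model.
    best = int(np.argmax(scores))

    # Stage 2: verification. The cohort is chosen at test time as the
    # highest-scoring competing models (assumption: unconstrained cohort).
    competitors = np.delete(scores, best)
    cohort = np.sort(competitors)[-cohort_size:]

    # Normalised score: best log-likelihood minus the mean cohort log-likelihood.
    normalised = float(scores[best] - cohort.mean())

    # Reject the utterance as coming from an unknown speaker if the
    # normalised score does not exceed the decision threshold.
    accepted = normalised > threshold
    return (best if accepted else None), normalised
```

With hypothetical scores `[-10.0, -4.0, -9.0, -8.5]`, a cohort size of 2 and a threshold of 3.0, the best model stands well clear of its competitors and is accepted. With scores `[-5.0, -4.8, -5.1, -4.9]` all models score almost equally, so the utterance is rejected as coming from an unknown speaker.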
Pages: 618-624
Number of pages: 7