Exploring different attributes of source information for speaker verification with limited test data

Cited by: 40

Authors:

Das, Rohan Kumar [1]

Prasanna, S. R. Mahadeva [1]

Affiliation:

[1] Indian Inst Technol Guwahati, Dept Elect & Elect Engn, Gauhati 781039, Assam, India

Source:

JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA | 2016 / Vol. 140 / No. 01

Keywords:

EXTRACTION; FEATURES

DOI:

10.1121/1.4954653

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

This work explores mel power difference of spectrum in subband, residual mel frequency cepstral coefficient, and discrete cosine transform of the integrated linear prediction residual for speaker verification under limited test data conditions. These three source features are found to capture different attributes of source information, namely, periodicity, smoothed spectrum information, and shape of the glottal signal, respectively. On the NIST SRE 2003 database, the proposed combination of the three source features performs better [equal error rate (EER): 20.19%, decision cost function (DCF): 0.3759] than the mel frequency cepstral coefficient feature (EER: 22.31%, DCF: 0.4128) for 2 s duration of test segments. © 2016 Acoustical Society of America.

Pages: 184-190

Number of pages: 7
