Multitaper Estimation of Frequency-Warped Cepstra With Application to Speaker Verification

Cited by: 23

Authors:

Sandberg, Johan [1]

Hansson-Sandsten, Maria [1]

Kinnunen, Tomi [2]

Saeidi, Rahim [2]

Flandrin, Patrick [3]

Borgnat, Pierre [3]

Affiliations:

[1] Lund Univ, Ctr Math Sci, SE-22100 Lund, Sweden

[2] Univ Eastern Finland, Dept Comp Sci & Stat, Speech & Image Proc Unit, FIN-80101 Joensuu, Finland

[3] Ecole Normale Super Lyon, CNRS, UMR 5672, Phys Lab, F-69364 Lyon, France

Source:

IEEE SIGNAL PROCESSING LETTERS | 2010 / Vol. 17 / Issue 04

Funding:

Swedish Research Council;

Keywords:

Cepstral analysis; MFCC; multiple windows; multitapers; speaker verification; speech analysis; GAUSSIAN MIXTURE-MODELS; SPECTRAL ESTIMATION; RECOGNITION; STATISTICS;

DOI:

10.1109/LSP.2010.2040228

Chinese Library Classification (CLC) number:

TM [Electrical Engineering]; TN [Electronics and Communication Technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

Usually the mel-frequency cepstral coefficients are estimated either from a periodogram or from a windowed periodogram. We state a general estimator which also includes multitaper estimators. We propose approximations of the variance and bias of the estimate of each coefficient. By using Monte Carlo computations, we demonstrate that the approximations are accurate. Using the proposed formulas, the peak matched multitaper estimator is shown to have low mean square error (squared bias plus variance) on speech-like processes. It is also shown to perform slightly better in the NIST 2006 speaker verification task as compared to the Hamming window conventionally used in this context.
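The general estimator mentioned in the abstract can be illustrated with a minimal sketch: a spectrum formed as a weighted average of several tapered periodograms, followed by a mel filterbank, a logarithm, and a cosine transform. Everything in the block is an assumption made for illustration, not the authors' implementation: the function names are hypothetical, sine tapers stand in for the peak matched multitapers studied in the paper, and the filterbank size, sampling rate, and frame length are arbitrary choices. A single Hamming window corresponds to the special case of one taper.

```python
import numpy as np

def sine_tapers(n, k):
    """Return k orthonormal sine tapers of length n, shape (k, n).
    Assumption: used here in place of the paper's peak matched tapers."""
    t = np.arange(1, n + 1)
    j = np.arange(1, k + 1)[:, None]
    return np.sqrt(2.0 / (n + 1)) * np.sin(np.pi * j * t / (n + 1))

def multitaper_spectrum(frame, tapers, weights=None):
    """Weighted average of the periodograms obtained with each taper.
    With a single taper this reduces to a windowed periodogram."""
    tapers = np.atleast_2d(tapers)
    if weights is None:
        weights = np.full(tapers.shape[0], 1.0 / tapers.shape[0])
    eigenspectra = np.abs(np.fft.rfft(tapers * frame, axis=1)) ** 2
    return weights @ eigenspectra

def mel_filterbank(n_filters, n_fft, fs):
    """Triangular filters spaced uniformly on the mel scale,
    shape (n_filters, n_fft // 2 + 1)."""
    mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    inv_mel = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    edges = inv_mel(np.linspace(0.0, mel(fs / 2.0), n_filters + 2))
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)
    bank = np.zeros((n_filters, freqs.size))
    for i in range(n_filters):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def mfcc_from_spectrum(spectrum, bank, n_coeffs):
    """Log filterbank energies followed by a DCT-II (unnormalized)."""
    log_energy = np.log(bank @ spectrum + 1e-12)
    m = bank.shape[0]
    basis = np.cos(np.pi * np.arange(n_coeffs)[:, None] * (np.arange(m) + 0.5) / m)
    return basis @ log_energy

# Usage on one synthetic frame (25 ms at an assumed 16 kHz sampling rate).
frame = np.random.default_rng(0).standard_normal(400)
bank = mel_filterbank(20, frame.size, 16000)
hamming = np.hamming(frame.size)
hamming /= np.linalg.norm(hamming)  # unit energy, comparable to the tapers
mfcc_single = mfcc_from_spectrum(multitaper_spectrum(frame, hamming), bank, 12)
mfcc_multi = mfcc_from_spectrum(multitaper_spectrum(frame, sine_tapers(400, 6)), bank, 12)
```

Averaging several orthogonal tapers lowers the variance of the spectrum estimate at the cost of some frequency resolution, which is the bias and variance trade-off the abstract refers to.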

Citation

Pages: 343-346

Page count: 4

References (14 in total)

[1] [Anonymous], 2001, Discrete-Time Speech Signal Processing: Principles and Practice

[2] BOGERT BP, 1963, P S TIM SER AN, P15

[3] Support vector machines for speaker and language recognition [J].