Dynamic time warping based approach to text-dependent speaker identification using spectrograms

Cited by: 13

Authors:

Dutta, Tridibesh [1]

Affiliations:

[1] Indian Stat Inst, Kolkata, India

Source:

CISP 2008: FIRST INTERNATIONAL CONGRESS ON IMAGE AND SIGNAL PROCESSING, VOL 2, PROCEEDINGS | 2008

Keywords:

DOI:

10.1109/CISP.2008.560

Chinese Library Classification (CLC):

TM [Electrical engineering]; TN [Electronics and communication technology]

Discipline classification codes:

0808; 0809

Abstract:

The goal of this paper is to study a new approach to text-dependent speaker identification that uses the complex patterns of variation in frequency and amplitude over time as an individual utters a given word, through spectrogram segmentation and template matching. The optimally segmented spectrograms are used as a database to successfully identify the unknown individual from his/her voice. The identification methodology relies on classification of spectrograms (of speech signals), based on dynamic time warping (DTW) matching of conditionally quantized frequency-time domain features of the database samples and the unknown speech sample. Experimental results on a sample collected from 40 speakers show that this methodology can be effectively used to produce a desirable success rate.
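The DTW template matching mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not specify the quantized features, the local cost, or the warping constraints, so the sketch assumes generic per-frame feature vectors, a Euclidean local cost, and the classic symmetric step pattern. The names `dtw_distance` and `identify` are hypothetical.

```python
import math


def dtw_distance(a, b):
    """Classic DTW cost between two sequences of equal-dimension feature vectors.

    Assumption: Euclidean local cost and unconstrained symmetric steps
    (match, insertion, deletion); the paper may use a different variant.
    """
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


def identify(templates, unknown):
    """Return the speaker label whose stored template is closest under DTW.

    `templates` maps a (hypothetical) speaker label to one feature sequence.
    """
    return min(templates, key=lambda label: dtw_distance(templates[label], unknown))


# Toy data: one-dimensional "features" standing in for spectrogram frames.
templates = {
    "speaker_a": [(0.0,), (1.0,), (2.0,)],
    "speaker_b": [(5.0,), (6.0,)],
}
unknown = [(0.0,), (0.0,), (1.0,), (2.0,)]  # time-stretched version of speaker_a
best = identify(templates, unknown)
```

Because DTW allows a frame in one sequence to align with several frames in the other, the time-stretched utterance in the toy data still matches its template at zero cost, which is the property that makes the technique suitable for comparing utterances spoken at different rates.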


Pages: 354-360

Page count: 7

References (11 in total; entries 1-10 listed):

[1] [Anonymous], 2007, P IM VIS COMP

[2] Demidenko E, 2004, LECT NOTES COMPUT SC, V3046, P933

[3] Duda R. O, 2006, PATTERN CLASSIF

[4] DUTTA T, 2007, PRIP 2007 P MINSK, V1, P87

[5] GUPTA H, FIELD EVALUATION TEX

[6] Jain, AK; Duin, RPW; Mao, JC. Statistical pattern recognition: A review [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2000, 22 (01): 4-37

[7] Olsson J., TEXT DEPENDENT SPEAK

[8] RATH TM, P 2003 IEEE COMP SOC, V2, P521

[9] REYNOLDS, DA. SPEAKER IDENTIFICATION AND VERIFICATION USING GAUSSIAN MIXTURE SPEAKER MODELS [J]. SPEECH COMMUNICATION, 1995, 17 (1-2): 91-108

[10] SAKOE, H; CHIBA, S. DYNAMIC-PROGRAMMING ALGORITHM OPTIMIZATION FOR SPOKEN WORD RECOGNITION [J]. IEEE TRANSACTIONS ON ACOUSTICS SPEECH AND SIGNAL PROCESSING, 1978, 26 (01): 43-49
