ACCELERATION OF SEQUENCE KERNEL COMPUTATION FOR REAL-TIME SPEAKER IDENTIFICATION

Cited by: 6
Authors
Yamada, Makoto
Sugiyama, Masashi
Wichern, Gordon
Matsui, Tomoko
Source
2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING | 2010
Keywords
Sequence kernel; k-means algorithm; pre-image; Virtual Studio Technology (VST)
DOI
10.1109/ICASSP.2010.5495542
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
The sequence kernel has been shown to be a promising kernel function for learning from sequential data such as speech and DNA. However, it does not scale to massive datasets because of its high computational cost. In this paper, we propose a method of approximating the sequence kernel that is shown to be computationally very efficient. More specifically, we formulate the problem of approximating the sequence kernel as the problem of obtaining a pre-image in a reproducing kernel Hilbert space. The effectiveness of the proposed approximation is demonstrated in text-independent speaker identification experiments with 10 male speakers: our approach provides a significant reduction in computation time with limited performance degradation. Based on the proposed method, we develop a real-time kernel-based speaker identification system using Virtual Studio Technology (VST).
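As a rough illustration of why the sequence kernel is costly and how a compressed representation can reduce that cost, the sketch below compares an exact mean-of-pairwise-kernels sequence kernel with a version computed on a few weighted k-means centroids per sequence. This is not the authors' algorithm: the abstract states only that the approximation is posed as a pre-image problem, and the keywords mention k-means. The Gaussian base kernel, the weighted-centroid scheme, and every name and parameter here (`rbf`, `kmeans`, `gamma`, `k`) are assumptions made for illustration.

```python
import numpy as np

def rbf(a, b, gamma):
    """Pairwise Gaussian kernel between rows of a (n, d) and b (m, d)."""
    sq = (a ** 2).sum(1)[:, None] + (b ** 2).sum(1)[None, :] - 2.0 * a @ b.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def sequence_kernel(x, y, gamma=0.5):
    """Exact sequence kernel (assumed form): mean of all n*m frame-pair kernels."""
    return rbf(x, y, gamma).mean()

def kmeans(x, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns centroids and cluster-size weights."""
    rng = np.random.default_rng(seed)
    # Fancy indexing copies, so updating centers never modifies x.
    centers = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(iters):
        dist = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = dist.argmin(1)
        for j in range(k):
            members = x[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                centers[j] = members.mean(0)
    # Final assignment, so the weights match the returned centroids.
    dist = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    weights = np.bincount(dist.argmin(1), minlength=k) / len(x)
    return centers, weights

def approx_sequence_kernel(cx, wx, cy, wy, gamma=0.5):
    """Approximate kernel from weighted centroids: O(k_x * k_y) kernel calls."""
    return wx @ rbf(cx, cy, gamma) @ wy

# Hypothetical data: two sequences of 12-dimensional feature frames.
rng = np.random.default_rng(1)
x = rng.normal(size=(400, 12))
y = rng.normal(size=(350, 12))

exact = sequence_kernel(x, y)                  # 400 * 350 kernel evaluations
cx, wx = kmeans(x, k=16)
cy, wy = kmeans(y, k=16)
approx = approx_sequence_kernel(cx, wx, cy, wy)  # 16 * 16 kernel evaluations
print(exact, approx)
```

In this sketch the centroids for each sequence can be computed once and reused, so each later kernel evaluation touches only `k * k` pairs instead of every pair of frames. How closely the approximation tracks the exact value depends on `k`, `gamma`, and the data.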
Pages: 1626-1629
Page count: 4