ACCELERATION OF SEQUENCE KERNEL COMPUTATION FOR REAL-TIME SPEAKER IDENTIFICATION

Cited by: 6

Authors:

Yamada, Makoto

Sugiyama, Masashi

Wichern, Gordon

Matsui, Tomoko

Source:

2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING | 2010

Keywords:

Sequence kernel; k-means algorithm; pre-image; Virtual Studio Technology (VST)

DOI:

10.1109/ICASSP.2010.5495542

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

The sequence kernel has been shown to be a promising kernel function for learning from sequential data such as speech and DNA. However, it is not scalable to massive datasets due to its high computational cost. In this paper, we propose a method of approximating the sequence kernel that is shown to be computationally very efficient. More specifically, we formulate the problem of approximating the sequence kernel as the problem of obtaining a pre-image in a reproducing kernel Hilbert space. The effectiveness of the proposed approximation is demonstrated in text-independent speaker identification experiments with 10 male speakers: our approach provides a significant reduction in computation time with limited performance degradation. Based on the proposed method, we develop a real-time kernel-based speaker identification system using Virtual Studio Technology (VST).
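The cost argument in the abstract can be illustrated with a minimal sketch. The abstract gives no formulas, so everything here is an assumption: the sequence kernel is taken to be the mean of frame-level Gaussian kernel values over all frame pairs, and the approximation replaces each sequence with a few k-means centroids weighted by cluster size (a reduced-set summary suggested by the keywords "k-means algorithm" and "pre-image"). The function names `rbf`, `sequence_kernel`, and `kmeans_summary`, and the parameters `gamma` and `k`, are hypothetical and not taken from the paper.

```python
import numpy as np


def rbf(a, b, gamma=0.5):
    """Gaussian kernel matrix between the rows of a (n, d) and b (m, d)."""
    sq = (a * a).sum(1)[:, None] + (b * b).sum(1)[None, :] - 2.0 * a @ b.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


def sequence_kernel(x, y, wx=None, wy=None, gamma=0.5):
    """Weighted mean of frame-level kernel values (assumed form).

    With uniform weights this costs O(len(x) * len(y)) kernel evaluations.
    """
    wx = np.full(len(x), 1.0 / len(x)) if wx is None else wx
    wy = np.full(len(y), 1.0 / len(y)) if wy is None else wy
    return float(wx @ rbf(x, y, gamma) @ wy)


def kmeans_summary(x, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means: returns k centroids and their cluster weights."""
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Assign every frame to its nearest centroid.
        dist = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = dist.argmin(1)
        # Move each non-empty cluster's centroid to the mean of its frames.
        for j in range(k):
            members = x[labels == j]
            if len(members):
                centers[j] = members.mean(0)
    counts = np.bincount(labels, minlength=k)
    return centers, counts / counts.sum()


# Synthetic "frames" standing in for acoustic feature vectors (illustrative only).
rng = np.random.default_rng(0)
x = rng.normal(size=(400, 4))
y = rng.normal(loc=0.3, size=(500, 4))

exact = sequence_kernel(x, y)             # 400 * 500 frame-level evaluations
cx, wx = kmeans_summary(x, k=16)
cy, wy = kmeans_summary(y, k=16)
approx = sequence_kernel(cx, cy, wx, wy)  # 16 * 16 frame-level evaluations
```

Under these assumptions each sequence is summarized once, so every later kernel evaluation touches only `k * k` centroid pairs instead of all frame pairs. How closely `approx` tracks `exact` depends on `k`, `gamma`, and the data, and the sketch makes no claim about the accuracy reported in the paper.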

Pages: 1626-1629

Number of pages: 4

