Language recognition with discriminative keyword selection

Cited by: 25

Authors:

Richardson, F. S. [1]

Campbell, W. M. [1]

Affiliations:

[1] MIT, Lincoln Lab, Cambridge, MA 02139 USA

Source:

2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12 | 2008

Keywords:

language recognition; support vector machines

DOI:

10.1109/ICASSP.2008.4518567

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

One commonly used approach for language recognition is to convert the input speech into a sequence of tokens such as words or phones and then to use these token sequences to determine the target language. The language classification is typically performed by extracting N-gram statistics from the token sequences and then using an N-gram language model or support vector machine (SVM) to perform the classification. One problem with these approaches is that the number of N-grams grows exponentially as the order N is increased. This is especially problematic for an SVM classifier as each utterance is represented as a distinct N-gram vector. In this paper we propose a novel approach for modeling higher order N-grams using an SVM via an alternating filter-wrapper feature selection method. We demonstrate the effectiveness of this technique on the NIST 2007 language recognition task.
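The N-gram statistics mentioned in the abstract, and the growth of the feature space with the order N, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the example phone tokens, and the vocabulary size of 50 are all hypothetical, chosen only to show how an utterance becomes an N-gram vector and why the number of possible N-grams (vocabulary size raised to the power N) quickly becomes unwieldy.

```python
from collections import Counter
from typing import Dict, List, Tuple


def ngram_probabilities(tokens: List[str], n: int) -> Dict[Tuple[str, ...], float]:
    """Relative frequency of each N-gram in one tokenized utterance.

    Hypothetical helper: the resulting sparse dict stands in for the
    per-utterance N-gram vector a classifier would consume.
    """
    grams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    counts = Counter(grams)
    total = sum(counts.values())
    if total == 0:
        return {}  # utterance shorter than n tokens
    return {gram: count / total for gram, count in counts.items()}


def max_ngram_count(vocab_size: int, n: int) -> int:
    """Upper bound on the number of distinct N-grams: vocab_size ** n."""
    return vocab_size ** n


# Made-up phone-token sequence, for illustration only.
utterance = ["AA", "B", "AA", "B", "K"]
bigrams = ngram_probabilities(utterance, 2)

# Assumed vocabulary of 50 phone tokens (not a figure from the paper).
for order in (1, 2, 3, 4):
    print(order, max_ngram_count(50, order))
```

Because most of the possible N-grams never occur in a given utterance, a sparse representation such as the dictionary above is a natural fit; selecting which high-order N-grams to keep is the problem the paper addresses.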

Pages: 4145-4148

Page count: 4