Active learning: Theory and applications to automatic speech recognition

Cited by: 111

Authors:

Riccardi, G [1]

Hakkani-Tür, D [1]

Affiliation:

[1] AT&T Labs Res, Florham Pk, NJ 07932 USA

Source:

IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING | 2005 / Vol. 13 / No. 4

Keywords:

acoustic modeling; active learning; language modeling; large vocabulary continuous speech recognition; machine learning

DOI:

10.1109/TSA.2005.848882

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

We are interested in the problem of adaptive learning in the context of automatic speech recognition (ASR). In this paper, we propose an active learning algorithm for ASR. Automatic speech recognition systems are trained using human supervision to provide transcriptions of speech utterances. The goal of active learning is to minimize the human supervision needed for training acoustic and language models and to maximize performance given the transcribed and untranscribed data. Active learning aims at reducing the number of training examples to be labeled by automatically processing the unlabeled examples and then selecting the most informative ones, with respect to a given cost function, for a human to label. In this paper, we describe how to estimate the confidence score for each utterance through an on-line algorithm using the lattice output of a speech recognizer. The utterance scores are filtered through the informativeness function, and an optimal subset of training samples is selected. The active learning algorithm has been applied to both batch and on-line learning schemes, and we have experimented with different selective sampling algorithms. Our experiments show that by using active learning, the amount of labeled data needed for a given word accuracy can be reduced by more than 60% with respect to random sampling.
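
The selection step described in the abstract can be illustrated with a minimal sketch. It assumes that each untranscribed utterance already has a confidence score in [0, 1] and that the least confident utterances are treated as the most informative. The function name `select_for_transcription`, the `budget` parameter, and the example scores are hypothetical; the paper's lattice-based confidence estimation and its informativeness function are not reproduced here.

```python
from typing import Dict, List


def select_for_transcription(confidence: Dict[str, float], budget: int) -> List[str]:
    """Return the ids of the `budget` utterances with the lowest confidence.

    Assumption: low recognizer confidence is used as a stand-in for
    informativeness. This is an illustration, not the paper's exact criterion.
    """
    if budget < 0:
        raise ValueError("budget must be non-negative")
    # Sort by (score, id) so that ties are broken deterministically.
    ranked = sorted(confidence, key=lambda utt_id: (confidence[utt_id], utt_id))
    return ranked[:budget]


# Hypothetical confidence scores for four untranscribed utterances.
scores = {"utt1": 0.92, "utt2": 0.35, "utt3": 0.61, "utt4": 0.18}
selected = select_for_transcription(scores, budget=2)
print(selected)  # ['utt4', 'utt2']
```

In a batch setting, the selected utterances would be sent to a human transcriber, added to the training set, and the models retrained before the next round of selection.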


Pages: 504-511

Page count: 8
