Recurrent Out-of-Vocabulary Word Detection Using Distribution of Features

Cited by: 3

Authors:

Asami, Taichi [1]

Masumura, Ryo [1]

Aono, Yushi [1]

Shinoda, Koichi [2]

Affiliations:

[1] NTT Corp, NTT Media Intelligence Labs, Tokyo, Japan

[2] Tokyo Inst Technol, Tokyo, Japan

Source:

17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES | 2016

Keywords:

speech recognition; OOV word detection; recurrent OOV words; distribution of features

DOI:

10.21437/Interspeech.2016-562

Chinese Library Classification number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

The repeated use of out-of-vocabulary (OOV) words in a spoken document seriously degrades a speech recognizer's performance. This paper provides a novel method for accurately detecting such recurrent OOV words. Standard OOV word detection methods classify each word segment into in-vocabulary (IV) or OOV. This word-by-word classification tends to be affected by sudden vocal irregularities in spontaneous speech, triggering false alarms. To avoid this sensitivity to the irregularities, our proposal focuses on consistency of the repeated occurrence of OOV words. The proposed method preliminarily detects recurrent segments, segments that contain the same word, in a spoken document by open vocabulary spoken term discovery using a phoneme recognizer. If the recurrent segments are OOV words, features for OOV detection in those segments should exhibit consistency. We capture this consistency by using the mean and variance (distribution) of features (DOF) derived from the recurrent segments, and use the DOF for IV/OOV classification. Experiments illustrate that the proposed method's use of the DOF significantly improves its performance in recurrent OOV word detection.
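
The DOF idea in the abstract can be illustrated with a minimal sketch. It assumes each recurrent segment has already been reduced to a fixed-length vector of OOV detection features; the abstract does not say which features or which classifier the authors use, so the function name `distribution_of_features`, the input layout, and the choice of population variance are all hypothetical and are not the paper's implementation.

```python
from statistics import fmean, pvariance
from typing import List, Sequence


def distribution_of_features(
    segment_features: Sequence[Sequence[float]],
) -> List[float]:
    """Summarize one cluster of recurrent segments as a DOF vector.

    Each inner sequence holds the (hypothetical) OOV detection features of
    one segment in the cluster. The result is the per-dimension means
    followed by the per-dimension variances.
    """
    if not segment_features:
        raise ValueError("a cluster needs at least one segment")
    dims = len(segment_features[0])
    if any(len(features) != dims for features in segment_features):
        raise ValueError("all segments must have the same number of features")

    # Transpose so that each column collects one feature across all segments.
    columns = list(zip(*segment_features))
    means = [fmean(column) for column in columns]
    # Population variance is an assumption; a single segment yields 0.0.
    variances = [pvariance(column) for column in columns]
    return means + variances


if __name__ == "__main__":
    # Two segments believed to contain the same word, two features each.
    cluster = [[1.0, 2.0], [3.0, 6.0]]
    print(distribution_of_features(cluster))
```

A low variance across the segments of a cluster is what the abstract calls consistency; under that reading, the resulting vector would be passed to an IV/OOV classifier in place of per-segment features.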


Pages: 1320-1324

Page count: 5
