Recent progress in corpus-based spontaneous speech recognition

Cited by: 19
Authors
Furui, S [1]
Affiliations
[1] Tokyo Inst Technol, Tokyo 1528552, Japan
Keywords
spontaneous speech recognition; corpus; model adaptation; indexing; summarization
DOI
10.1093/ietisy/e88-d.3.366
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper overviews recent progress in the development of corpus-based spontaneous speech recognition technology. Although speech is in almost any situation spontaneous, recognition of spontaneous speech is an area which has only recently emerged in the field of automatic speech recognition. Broadening the application of speech recognition depends crucially on raising recognition performance for spontaneous speech. For this purpose, it is necessary to build large spontaneous speech corpora for constructing acoustic and language models. This paper focuses on various achievements of a Japanese 5-year national project "Spontaneous Speech: Corpus and Processing Technology" that has recently been completed. Because of various spontaneous-speech specific phenomena, such as filled pauses, repairs, hesitations, repetitions and disfluencies, recognition of spontaneous speech requires various new techniques. These new techniques include flexible acoustic modeling, sentence boundary detection, pronunciation modeling, acoustic as well as language model adaptation, and automatic summarization. Particularly automatic summarization including indexing, a process which extracts important and reliable parts of the automatic transcription, is expected to play an important role in building various speech archives, speech-based information retrieval systems, and human-computer dialogue systems.
Pages: 366-375
Number of pages: 10