Evaluation Framework for Distant-talking Speech Recognition under Reverberant Environments - Newest Part of the CENSREC Series -

Cited by: 0
Authors
Nishiura, Takanobu
Nakayama, Masato
Denda, Yuki
Kitaoka, Norihide
Yamamoto, Kazumasa
Yamada, Takeshi
Tsuge, Satoru
Miyajima, Chiyomi
Fujimoto, Masakiyo
Takiguchi, Tetsuya
Tamura, Satoshi
Kuroiwa, Shingo
Takeda, Kazuya
Nakamura, Satoshi
Source
SIXTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, LREC 2008 | 2008
DOI
Not available
Chinese Library Classification number
H0 [Linguistics];
Subject classification codes
030303 ; 0501 ; 050102 ;
Abstract
Recently, speech recognition performance has been drastically improved by statistical methods and huge speech databases. Attention is now turning to improving performance in realistic environments such as noisy conditions. Since October 2001, our working group of the Information Processing Society of Japan has been developing evaluation methodologies and frameworks for Japanese noisy speech recognition. We have released frameworks, each comprising databases and evaluation tools: CENSREC-1 (Corpus and Environment for Noisy Speech RECognition 1; formerly AURORA-2J), CENSREC-2 (in-car connected digit recognition), CENSREC-3 (in-car isolated word recognition), and CENSREC-1-C (voice activity detection under noisy conditions). In this paper, we introduce a new collection of databases and evaluation tools named CENSREC-4, an evaluation framework for distant-talking speech under hands-free conditions. Distant-talking speech recognition is crucial for a hands-free speech interface. We therefore measured room impulse responses to investigate reverberant speech recognition. Evaluation experiments showed that CENSREC-4 is an effective database for evaluating new dereverberation methods, because a traditional dereverberation process had difficulty sufficiently improving recognition performance. The framework was released in March 2008, and many studies are being conducted with it in Japan.
Pages: 1828-1834
Number of pages: 7
Related papers
7 items in total
[1]   CENSREC-3: An evaluation framework for Japanese speech recognition in real car-driving environments [J].
Fujimoto, Masakiyo ;
Takeda, Kazuya ;
Nakamura, Satoshi .
IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2006, E89D (11) :2783-2793
[2]   CEPSTRAL ANALYSIS TECHNIQUE FOR AUTOMATIC SPEAKER VERIFICATION [J].
FURUI, S .
IEEE TRANSACTIONS ON ACOUSTICS SPEECH AND SIGNAL PROCESSING, 1981, 29 (02) :254-272
[3]   HIRSCH HG, 2000, ISCA ITRW ASR2000
[4]   Kitaoka N, 2007, 2007 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING, VOLS 1 AND 2, P607
[5]   AURORA-2J: An evaluation framework for Japanese noisy speech recognition [J].
Nakamura, S ;
Takeda, K ;
Yamamoto, K ;
Yamada, T ;
Kuroiwa, S ;
Kitaoka, N ;
Nishiura, T ;
Sasou, A ;
Mizumachi, M ;
Miyajima, C ;
Fujimoto, M ;
Endo, T .
IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2005, E88D (03) :535-544
[6]   Nakamura S, 2006, INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, P2330
[7]   AN OPTIMUM COMPUTER-GENERATED PULSE SIGNAL SUITABLE FOR THE MEASUREMENT OF VERY LONG IMPULSE RESPONSES [J].
SUZUKI, Y ;
ASANO, F ;
KIM, HY ;
SONE, T .
JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1995, 97 (02) :1119-1123