Robust speech recognition system for communication robots in real environments

Cited by: 7
Authors
Ishi, Carlos Toshinori [1]
Matsuda, Shigeki [2]
Kanda, Takayuki [1]
Jitsuhiro, Takatoshi [3]
Ishiguro, Hiroshi [1]
Nakamura, Satoshi [2]
Hagita, Norihiro [1]
Affiliations
[1] ATR, Intelligent Robot & Commun Labs, Kyoto, Japan
[2] Natl Inst Informat & Commun Technol, Spoken Language Commun Res Lab, ATR, Kyoto, Japan
[3] Knowledge Sci Lab, ATR, Kyoto, Japan
Keywords
communication robots; speech recognition; robustness; acoustic noise; children speech
DOI
10.1109/ICHR.2006.321294
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The application range of communication robots could be widely expanded by an automatic speech recognition (ASR) system with improved robustness to noise and to speakers of different ages. In this paper, we describe an ASR system that can robustly recognize speech by adults and children in noisy environments. We evaluate the ASR system in a communication robot placed in a real noisy environment. Speech is captured using a twelve-element microphone array arranged on the robot's chest. To suppress interference and noise and to attenuate reverberation, we implemented a multi-channel system consisting of an outlier-robust generalized sidelobe canceller (RGSC) technique and feature-space noise suppression using MMSE criteria. Speech activity periods are detected using GMM-based end-point detection (GMM-EPD). Our ASR system has two decoders, one for adults' speech and one for children's speech. The final hypothesis is selected based on posterior probability. We then assign a generalized word posterior probability (GWPP)-based confidence measure to this hypothesis, and if it is higher than a threshold, we transfer it to a subsequent dialog processing module. The performance of each step was evaluated for adults' and children's speech by adding different levels of real environment noise recorded in a cafeteria. Experimental results indicated that our ASR system could achieve over 80% word accuracy in 70 dBA noise. Further evaluation of adult speech recorded in a real noisy environment resulted in 73% word accuracy.
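The selection and rejection steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Hypothesis` class, the function names, the scores, and the 0.5 threshold are all hypothetical, and the `confidence` field merely stands in for a GWPP-based measure, which is not computed here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Hypothesis:
    text: str
    posterior: float   # hypothetical decoder-level posterior score in [0, 1]
    confidence: float  # stand-in for a GWPP-based confidence measure in [0, 1]


def select_hypothesis(adult: Hypothesis, child: Hypothesis) -> Hypothesis:
    """Keep the output of whichever decoder has the higher posterior score."""
    return adult if adult.posterior >= child.posterior else child


def accept(hyp: Hypothesis, threshold: float = 0.5) -> Optional[str]:
    """Return the text for the dialog module only if confidence exceeds the threshold."""
    return hyp.text if hyp.confidence > threshold else None


# Made-up scores for one utterance decoded by both (hypothetical) decoders.
adult_hyp = Hypothesis("hello robot", posterior=0.62, confidence=0.81)
child_hyp = Hypothesis("yellow robot", posterior=0.38, confidence=0.40)

best = select_hypothesis(adult_hyp, child_hyp)
result = accept(best, threshold=0.5)
```

A rejected hypothesis is represented here as `None`, so a downstream dialog module would only ever see text that passed the confidence check; how the actual system signals rejection is not stated in the abstract.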
Pages: 340 / +
Number of pages: 2
Related papers
50 in total
  • [21] Robust coherence-based spectral enhancement for speech recognition in adverse real-world environments
    Barfuss, Hendrik
    Huemmer, Christian
    Schwarz, Andreas
    Kellermann, Walter
    COMPUTER SPEECH AND LANGUAGE, 2017, 46 : 388 - 400
  • [22] Linearized distortion model for robust speech recognition in noisy environments
    He, Yong-Jun
    Han, Ji-Qing
    Tongxin Xuebao/Journal on Communications, 2010, 31 (09): : 8 - 14
  • [23] A new framework for robust speech recognition in complex channel environments
    He, Yongjun
    Han, Jiqing
    Zheng, Tieran
    Sun, Guanglu
    DIGITAL SIGNAL PROCESSING, 2014, 32 : 109 - 123
  • [24] Robust speech recognition using factorial HMMs for home environments
    Betkowska, Agnieszka
    Shinoda, Koichi
    Furui, Sadaoki
    EURASIP JOURNAL ON ADVANCES IN SIGNAL PROCESSING, 2007, 2007 (1)
  • [25] Robust Speech Recognition Using Factorial HMMs for Home Environments
    Agnieszka Betkowska
    Koichi Shinoda
    Sadaoki Furui
    EURASIP Journal on Advances in Signal Processing, 2007
  • [26] Robust Automatic Speech Recognition for Accented Mandarin in Car Environments
    Pei Ding
    Lei He
    Xiang Yan
    Jie Hao
    INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 2542 - 2545
  • [27] Robust Feature Extraction Methods for Speech Recognition in Noisy Environments
    Mukheolkar, Ajinkya Sunil
    Alex, John Sahaya Rani
    2014 FIRST INTERNATIONAL CONFERENCE ON NETWORKS & SOFT COMPUTING (ICNSC), 2014, : 295 - 299
  • [28] A robust feature extraction for automatic speech recognition in noisy environments
    Lima, C
    Almeida, LB
    Monteiro, JL
    2002 6TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING PROCEEDINGS, VOLS I AND II, 2002, : 540 - 543
  • [29] Which French speech recognition system for assistant robots?
    Fadel, Wiam
    Araf, Imane
    Bouchentouf, Toumi
    Buvet, Pierre-Andre
    Bourzeix, Francois
    Bourja, Omar
    2022 2ND INTERNATIONAL CONFERENCE ON INNOVATIVE RESEARCH IN APPLIED SCIENCE, ENGINEERING AND TECHNOLOGY (IRASET'2022), 2022, : 675 - 679
  • [30] An On-device Robust Sound Recognition System for Real-time Context Awareness of Robots
    Song, Ju-man
    Kim, Changmin
    Son, Jungkwan
    2024 33RD IEEE INTERNATIONAL CONFERENCE ON ROBOT AND HUMAN INTERACTIVE COMMUNICATION, ROMAN 2024, 2024, : 2212 - 2218