Closely coupled array processing and model-based compensation for microphone array speech recognition

Cited by: 12
Authors
Zhao, Xianyu [1]
Ou, Zhijian [1]
Affiliations
[1] Tsinghua Univ, Dept Elect Engn, Beijing 100084, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
array signal processing; microphone array; model-based compensation; robust speech recognition
DOI
10.1109/TASL.2006.881673
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
In conventional microphone array speech recognition, the array processor and the speech recognizer are loosely coupled. The only connection between the two modules is the enhanced target signal output from the array processor, which is then treated as a single input to the recognizer. In this approach, useful environmental information, which can be provided by the array processor and also needs to be exploited by the recognizer, is ignored. Inherently, the array processor can generate multiple outputs of spatially filtered signals, acting as a multi-input-multi-output (MIMO) module. In this paper, a closely coupled approach is proposed, in which a recognizer with model-based noise compensation exploits the reference noise outputs from a MIMO array processor. Specifically, a multichannel model-based noise compensation method is presented, including the compensation procedure using the vector Taylor series (VTS) expansion and parameter estimation using the expectation-maximization (EM) algorithm. It is also shown how to construct MIMO array processors from conventional beamformers. A number of practical implementations of the conventional loosely coupled approach and the proposed closely coupled approach were tested on a publicly available database, the Multichannel Overlapping Number Corpus (MONC). Experimental results showed that the proposed closely coupled approach significantly improved the speech recognition performance in the overlapping speech situations.
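The abstract does not spell out how a MIMO array processor is built from a conventional beamformer, so the following is only an illustrative sketch of the general idea, not the authors' construction. It assumes a delay-and-sum beamformer with known integer sample delays and derives the reference noise outputs from adjacent-channel differences (a Griffiths-Jim style blocking step). The function name `mimo_delay_and_sum` and its parameters are hypothetical.

```python
from typing import List, Sequence, Tuple


def mimo_delay_and_sum(
    channels: Sequence[Sequence[float]],
    delays: Sequence[int],
) -> Tuple[List[float], List[List[float]]]:
    """Return (target, noise_refs) from time-domain microphone signals.

    Assumptions (not taken from the paper): `delays` holds the known,
    non-negative integer sample delay of the target at each microphone,
    and all channels have the same length.
    """
    if len(channels) != len(delays):
        raise ValueError("need exactly one delay per channel")
    if len(channels) < 2:
        raise ValueError("need at least two microphones")
    n = len(channels[0])
    if any(len(ch) != n for ch in channels):
        raise ValueError("all channels must have the same length")
    if any(d < 0 or d > n for d in delays):
        raise ValueError("delays must lie in [0, number of samples]")

    # Time-align every channel on the target by advancing it by its delay;
    # the tail is zero-padded so all channels keep length n.
    aligned = [list(ch[d:]) + [0.0] * d for ch, d in zip(channels, delays)]
    m = len(aligned)

    # Output 1: the usual enhanced target (average of the aligned channels).
    target = [sum(col) / m for col in zip(*aligned)]

    # Outputs 2..M: differences of adjacent aligned channels. The target is
    # identical in the aligned channels and cancels, leaving noise references.
    noise_refs = [
        [a - b for a, b in zip(aligned[i], aligned[i + 1])]
        for i in range(m - 1)
    ]
    return target, noise_refs
```

With M microphones this yields one target output and M - 1 reference noise outputs. In a closely coupled system of the kind the abstract describes, such reference outputs would be passed to the recognizer's noise compensation rather than discarded.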
Pages: 1114-1122
Page count: 9