Closely coupled array processing and model-based compensation for microphone array speech recognition

Cited by: 12
Authors
Zhao, Xianyu [1]
Ou, Zhijian [1]
Affiliations
[1] Tsinghua Univ, Dept Elect Engn, Beijing 100084, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
array signal processing; microphone array; model-based compensation; robust speech recognition
DOI
10.1109/TASL.2006.881673
Chinese Library Classification
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
In conventional microphone array speech recognition, the array processor and the speech recognizer are loosely coupled. The only connection between the two modules is the enhanced target signal output from the array processor, which is then treated as a single input to the recognizer. In this approach, useful environmental information, which can be provided by the array processor and also needs to be exploited by the recognizer, is ignored. Inherently, the array processor can generate multiple outputs of spatially filtered signals, as a multi-input-multi-output (MIMO) module. In this paper, a closely coupled approach is proposed, in which a recognizer with model-based noise compensation exploits the reference noise outputs from a MIMO array processor. Specifically, a multichannel model-based noise compensation is presented, including the compensation procedure using the vector Taylor series (VTS) expansion and parameter estimation using the expectation-maximization (EM) algorithm. It is also shown how to construct MIMO array processors from conventional beamformers. A number of practical implementations of the conventional loosely coupled approach and the proposed closely coupled approach were tested on a publicly available database, the Multichannel Overlapping Number Corpus (MONC). Experimental results showed that the proposed closely coupled approach significantly improved the speech recognition performance in the overlapping speech situations.
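For orientation, the VTS compensation step named in the abstract can be illustrated in its generic single-channel form. The sketch below is not the multichannel procedure of the paper: it assumes the common log-spectral mismatch function y = x + log(1 + exp(n - x)), diagonal Gaussians for clean speech and noise, and a first-order expansion around the two means. The function name `vts_compensate` and its parameters are hypothetical.

```python
import numpy as np

def vts_compensate(mu_x, var_x, mu_n, var_n):
    """First-order VTS compensation of a diagonal Gaussian (log-spectral domain).

    Assumption (not taken from the paper): noisy speech y relates to clean
    speech x and noise n through y = x + log(1 + exp(n - x)).
    """
    mu_x = np.asarray(mu_x, dtype=float)
    var_x = np.asarray(var_x, dtype=float)
    mu_n = np.asarray(mu_n, dtype=float)
    var_n = np.asarray(var_n, dtype=float)

    # Compensated mean: mismatch function evaluated at the expansion point.
    # logaddexp(a, b) = log(exp(a) + exp(b)) = a + log(1 + exp(b - a)).
    mu_y = np.logaddexp(mu_x, mu_n)

    # Jacobian dy/dx at the expansion point; dy/dn is 1 - g.
    g = np.exp(mu_x - mu_y)

    # Compensated variance from the linearized model.
    var_y = g**2 * var_x + (1.0 - g) ** 2 * var_n
    return mu_y, var_y

# Example: one log-spectral bin where speech and noise have equal mean level.
mu_y, var_y = vts_compensate([0.0], [1.0], [0.0], [2.0])
```

When the noise mean is far below the speech mean, `g` approaches 1 and the clean-speech Gaussian is returned almost unchanged, which is the expected limiting behavior of this kind of compensation.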
Pages: 1114-1122
Number of pages: 9