Closely coupled array processing and model-based compensation for microphone array speech recognition

Cited by: 12

Authors:

Zhao, Xianyu [1]

Ou, Zhijian [1]

Affiliations:

[1] Tsinghua Univ, Dept Elect Engn, Beijing 100084, Peoples R China

Source:

IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | 2007 / Vol. 15 / No. 3

Funding:

National Natural Science Foundation of China;

Keywords:

array signal processing; microphone array; model-based compensation; robust speech recognition;

DOI:

10.1109/TASL.2006.881673

Chinese Library Classification:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

In conventional microphone array speech recognition, the array processor and the speech recognizer are loosely coupled. The only connection between the two modules is the enhanced target signal output from the array processor, which then gets treated as a single input to the recognizer. In this approach, useful environmental information, which can be provided by the array processor and also needs to be exploited by the recognizer, is ignored. Inherently, the array processor can generate multiple outputs of spatially filtered signals, as a multi-input-multi-output (MIMO) module. In this paper, a closely coupled approach is proposed, in which a recognizer with model-based noise compensation exploits the reference noise outputs from a MIMO array processor. Specifically, a multichannel model-based noise compensation is presented, including the compensation procedure using the vector Taylor series (VTS) expansion and parameter estimation using the expectation-maximization (EM) algorithm. It is also shown how to construct MIMO array processors from conventional beamformers. A number of practical implementations of the conventional loosely coupled approach and the proposed closely coupled approach were tested on a publicly available database, the Multichannel Overlapping Number Corpus (MONC). Experimental results showed that the proposed closely coupled approach significantly improved the speech recognition performance in the overlapping speech situations.


Pages: 1114-1122

Page count: 9
