Robust speech recognition with multi-channel codebook dependent cepstral normalization (MCDCN)

Citations: 0

Authors:

Deligne, S [1]

Gopinath, R [1]

Affiliation:

[1] IBM Corp, Thomas J Watson Res Ctr, Yorktown Hts, NY 10598 USA

Source:

ASRU 2001: IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING, CONFERENCE PROCEEDINGS | 2001

Keywords:

DOI:

Not available

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this paper, we address the issue of speech recognition in the presence of interfering signals, in cases where the signals corrupting the speech are recorded in separate channels. We propose to combine a trivial form of filtering with MCDCN, a multi-channel version of Codebook Dependent Cepstral Normalization, in which the cepstra of the noise are estimated from the reference signals. We report on recognition experiments in a car where the speech signal is corrupted by radio talk or CD music played through the car speakers. Our approach yields relative word error rate reductions in the range of 70-90% compared to a no-compensation baseline, at a relatively low computational cost.

Pages: 151-154

Page count: 4
