Robust speech recognition with multi-channel codebook dependent cepstral normalization (MCDCN)

Cited by: 0
Authors
Deligne, S [1]
Gopinath, R [1]
Affiliations
[1] IBM Corp, Thomas J Watson Res Ctr, Yorktown Hts, NY 10598 USA
Source
ASRU 2001: IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING, CONFERENCE PROCEEDINGS | 2001
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we address the issue of speech recognition in the presence of interfering signals, in cases where the signals corrupting the speech are recorded in separate channels. We propose to combine a trivial form of filtering with MCDCN, a multi-channel version of Codebook Dependent Cepstral Normalization, where the cepstra of the noise are estimated from the reference signals. We report on recognition experiments in a car where the speech signal is corrupted by radio talk or CD music played through the car speakers. Our approach allows relative word error rate reductions in the range of 70-90% compared to a no-compensation baseline, at a relatively low computational cost.
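For readers unfamiliar with the metric, a relative word error rate (WER) reduction is the drop in WER expressed as a fraction of the baseline WER. The sketch below shows only that arithmetic. The function name `relative_wer_reduction` and the numbers in it are hypothetical illustrations, not values reported in the paper.

```python
def relative_wer_reduction(baseline_wer: float, compensated_wer: float) -> float:
    """Return the relative WER reduction as a fraction of the baseline WER."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - compensated_wer) / baseline_wer


# Hypothetical numbers for illustration only; these are NOT results from the paper.
example = relative_wer_reduction(baseline_wer=40.0, compensated_wer=8.0)
print(f"{example:.0%}")  # prints "80%"
```

Under this definition, a reduction of 70-90% means the compensated system makes between roughly one third and one tenth as many word errors as the no-compensation baseline.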
Pages: 151-154
Page count: 4
Related papers
50 in total
  • [41] SPEAKER ADAPTED BEAMFORMING FOR MULTI-CHANNEL AUTOMATIC SPEECH RECOGNITION
    Menne, Tobias
    Schlueter, Ralf
    Ney, Hermann
    2018 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2018), 2018, : 535 - 541
  • [42] END-TO-END MULTI-CHANNEL TRANSFORMER FOR SPEECH RECOGNITION
    Chang, Feng-Ju
    Radfar, Martin
    Mouchtaris, Athanasios
    King, Brian
    Kunzmann, Siegfried
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 5884 - 5888
  • [43] Robust Speech Recognition Combining Cepstral and Articulatory Features
    Zha, Zhuan-ling
    Hu, Jin
    Zhan, Qing-ran
    Shan, Ya-hui
    Xie, Xiang
    Wang, Jing
    Cheng, Hao-bo
    PROCEEDINGS OF 2017 3RD IEEE INTERNATIONAL CONFERENCE ON COMPUTER AND COMMUNICATIONS (ICCC), 2017, : 1401 - 1405
  • [44] Multi-channel Attention for End-to-End Speech Recognition
    Braun, Stefan
    Neil, Daniel
    Anumula, Jithendar
    Ceolini, Enea
    Liu, Shih-Chii
    19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 17 - 21
  • [45] Audio-visual Multi-channel Recognition of Overlapped Speech
    Yu, Jianwei
    Wu, Bo
    Gu, Rongzhi
    Zhang, Shi-Xiong
    Chen, Lianwu
    Xu, Yong
    Yu, Meng
    Su, Dan
    Yu, Dong
    Liu, Xunying
    Meng, Helen
    INTERSPEECH 2020, 2020, : 3496 - 3500
  • [46] The segmentation of multi-channel meeting recordings for automatic speech recognition
    Dines, John
    Vepa, Jithendra
    Hain, Thomas
    INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 1213 - +
  • [47] Quaternion Neural Networks for Multi-channel Distant Speech Recognition
    Qiu, Xinchi
    Parcollet, Titouan
    Ravanelli, Mirco
    Lane, Nicholas D.
    Morchid, Mohamed
    INTERSPEECH 2020, 2020, : 329 - 333
  • [48] MULTI-CHANNEL OVERLAPPED SPEECH RECOGNITION WITH LOCATION GUIDED SPEECH EXTRACTION NETWORK
    Chen, Zhuo
    Xiao, Xiong
    Yoshioka, Takuya
    Erdogan, Hakan
    Li, Jinyu
    Gong, Yifan
    2018 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2018), 2018, : 558 - 565
  • [49] Multi-channel Iterative Dereverberation based on Codebook Constrained Iterative Multi-channel Wiener Filter
    Ajay, S.
    Sreenivas, T. V.
    11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2, 2010, : 981 - 984
  • [50] A GENERATIVE-DISCRIMINATIVE HYBRID APPROACH TO MULTI-CHANNEL NOISE REDUCTION FOR ROBUST AUTOMATIC SPEECH RECOGNITION
    Meutzner, Hendrik
    Araki, Shoko
    Fujimoto, Masakiyo
    Nakatani, Tomohiro
    2016 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING PROCEEDINGS, 2016, : 5740 - 5744