SPATIAL DIFFUSENESS FEATURES FOR DNN-BASED SPEECH RECOGNITION IN NOISY AND REVERBERANT ENVIRONMENTS

Cited by: 0
Authors
Schwarz, Andreas [1 ]
Huemmer, Christian [1 ]
Maas, Roland [1 ]
Kellermann, Walter [1 ]
Affiliation
[1] Friedrich Alexander Univ Erlangen Nurnberg FAU, Multimedia Commun & Signal Proc, Cauerstr 7, D-91058 Erlangen, Germany
Keywords
Speech Recognition; Reverberation; Diffuse Noise; Deep Neural Networks; PERCEPTION;
DOI
Not available
Chinese Library Classification
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
We propose a spatial diffuseness feature for deep neural network (DNN)-based automatic speech recognition to improve recognition accuracy in reverberant and noisy environments. The feature is computed in real-time from multiple microphone signals without requiring knowledge or estimation of the direction of arrival, and represents the relative amount of diffuse noise in each time and frequency bin. It is shown that using the diffuseness feature as an additional input to a DNN-based acoustic model leads to a reduced word error rate for the REVERB challenge corpus, both compared to logmelspec features extracted from noisy signals, and features enhanced by spectral subtraction.
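To make the idea of a per-bin diffuseness value concrete, the following is a minimal Python sketch, not the estimator used in the paper. It assumes two microphones and a negligible diffuse-field coherence between them (roughly valid only for large spacing or high frequencies); under that assumption the magnitude of the short-time coherence equals CDR / (1 + CDR), so 1 minus that magnitude approximates the diffuseness 1 / (1 + CDR) without any direction-of-arrival information. The function names, frame length, hop size, and smoothing factor are illustrative choices, not values from the source.

```python
import numpy as np

def stft(x, frame_len=512, hop=256):
    """Hann-windowed short-time Fourier transform; returns (frames, bins)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def diffuseness_feature(x1, x2, smoothing=0.9, eps=1e-12):
    """Crude per-bin diffuseness in [0, 1] from two microphone signals.

    ASSUMPTION (not from the paper): the diffuse-field coherence is ~0,
    so diffuseness is approximated by 1 - |coherence|.
    """
    X1, X2 = stft(x1), stft(x2)
    # Initialise the auto- and cross-power spectra with the first frame.
    p11 = np.abs(X1[0]) ** 2
    p22 = np.abs(X2[0]) ** 2
    p12 = X1[0] * np.conj(X2[0])
    out = np.empty(X1.shape)
    for t in range(X1.shape[0]):
        if t > 0:
            # Causal recursive averaging, so each frame uses only past data.
            p11 = smoothing * p11 + (1 - smoothing) * np.abs(X1[t]) ** 2
            p22 = smoothing * p22 + (1 - smoothing) * np.abs(X2[t]) ** 2
            p12 = smoothing * p12 + (1 - smoothing) * X1[t] * np.conj(X2[t])
        coherence_mag = np.abs(p12) / np.sqrt(p11 * p22 + eps)
        out[t] = np.clip(1.0 - coherence_mag, 0.0, 1.0)
    return out
```

Feeding the same signal to both inputs yields values near 0 (fully coherent), while two independent noise signals yield values well above 0. In a recognizer along the lines described in the abstract, such a time-frequency map would presumably be reduced to the same resolution as the spectral features and appended to them as additional DNN input; that step is not shown here.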
Pages: 4380 - 4384
Page count: 5
Related papers
50 in total
  • [21] Comparing Fusion Models for DNN-Based Audiovisual Continuous Speech Recognition
    Abdelaziz, Ahmed Hussen
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2018, 26 (03) : 475 - 484
  • [22] DNN-Based Speech Synthesis: Importance of Input Features and Training Data
    Lazaridis, Alexandros
    Potard, Blaise
    Garner, Philip N.
    SPEECH AND COMPUTER (SPECOM 2015), 2015, 9319 : 193 - 200
  • [23] DNN-Based Arabic Speech Synthesis
    Amrouche, Aissa
    Bentrcia, Youssouf
    Boubakeur, Khadidja Nesrine
    Abed, Ahcene
    2022 9TH INTERNATIONAL CONFERENCE ON ELECTRICAL AND ELECTRONICS ENGINEERING (ICEEE 2022), 2022, : 378 - 382
  • [24] Noisy-target Training: A Training Strategy for DNN-based Speech Enhancement without Clean Speech
    Fujimura, Takuya
    Koizumi, Yuma
    Yatabe, Kohei
    Miyazaki, Ryoichi
    29TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO 2021), 2021, : 436 - 440
  • [25] Speech intelligibility improvement in noisy reverberant environments based on speech enhancement and inverse filtering
    Dong, Huan-Yu
    Lee, Chang-Myung
    EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2018
  • [26] Speech intelligibility improvement in noisy reverberant environments based on speech enhancement and inverse filtering
    Huan-Yu Dong
    Chang-Myung Lee
    EURASIP Journal on Audio, Speech, and Music Processing, 2018
  • [27] Chinese speech intelligibility of children in noisy and reverberant environments
    Peng, Jianxin
    Wu, Shengju
    INDOOR AND BUILT ENVIRONMENT, 2018, 27 (10) : 1357 - 1363
  • [28] An Investigation into Audiovisual Speech Correlation in Reverberant Noisy Environments
    Cifani, Simone
    Abel, Andrew
    Hussain, Amir
    Squartini, Stefano
    Piazza, Francesco
    CROSS-MODAL ANALYSIS OF SPEECH, GESTURES, GAZE AND FACIAL EXPRESSIONS, 2009, 5641 : 331 - +
  • [29] TDOA ESTIMATION OF SPEECH SOURCE IN NOISY REVERBERANT ENVIRONMENTS
    Bu, Suliang
    Zhao, Tuo
    Zhao, Yunxin
    2022 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP, SLT, 2022, : 1059 - 1066
  • [30] DNN-Based PolSAR Image Classification on Noisy Labels
    Ni, Jun
    Xiang, Deliang
    Lin, Zhiyuan
    Lopez-Martinez, Carlos
    Hu, Wei
    Zhang, Fan
    IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, 2022, 15 : 3697 - 3713