A multi-channel speech enhancement framework for robust NMF-based speech recognition for speech-impaired users

Cited by: 0
Authors
Dekkers, Gert [1, 2, 4]
van Waterschoot, Toon [1, 2]
Vanrumste, Bart [1, 2, 4]
Van Den Broeck, Bert [1, 2, 4]
Gemmeke, Jort F. [3]
Van Hamme, Hugo [3]
Karsmakers, Peter [1, 2, 4]
Affiliations
[1] KU Leuven TC Geel, ESAT ETC AdvISe, Kleinhoefstr 4, B-2440 Geel, Belgium
[2] Katholieke Univ Leuven, ESAT STADIUS, B-3001 Leuven, Belgium
[3] Katholieke Univ Leuven, ESAT PSI, B-3001 Leuven, Belgium
[4] iMinds, Med IT, B-3001 Leuven, Belgium
Source
16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5 | 2015
Keywords
multi-channel speech enhancement; speech recognition; uncertainty of estimation; dysarthric speech; INTEGRATION;
DOI
Not available
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
This paper proposes a multi-channel speech enhancement framework for distant speech acquisition in noisy and reverberant environments, intended for Non-negative Matrix Factorization (NMF)-based Automatic Speech Recognition (ASR). The system is evaluated for use in an assistive vocal interface for physically impaired and speech-impaired users. The framework combines the Spatially Pre-processed Speech Distortion Weighted Multi-channel Wiener Filter (SP-SDW-MWF) with a postfilter to reduce noise and reverberation. Additionally, the estimation uncertainty of the speech enhancement framework is propagated through the Mel-Frequency Cepstral Coefficient (MFCC) feature extraction to allow for feature compensation at a later stage. Results indicate that (a) using a trade-off parameter between noise reduction and speech distortion improves recognition performance relative to the well-known Generalized Sidelobe Canceller (GSC) and Multi-channel Wiener Filter (MWF), and (b) adding the postfilter and the feature compensation increases performance relative to several baselines for both a non-pathological and a pathological speaker.
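The trade-off between noise reduction and speech distortion can be illustrated with the standard SDW-MWF solution. This is only a sketch drawn from the general SDW-MWF formulation, not from the paper's own notation: the symbols $\mathbf{R}_s$ (speech correlation matrix), $\mathbf{R}_n$ (noise correlation matrix), $\mu$ (trade-off parameter) and $\mathbf{e}_1$ (selector of the reference microphone) are assumptions introduced here, and the spatial pre-processing stage and the postfilter of the framework are left out.

```latex
% Assumed signal model: y = s + n, with speech s and noise n uncorrelated.
% Cost: speech distortion plus mu-weighted residual noise power.
\begin{align}
  J(\mathbf{w}) &= E\!\left\{\left|\mathbf{e}_1^{H}\mathbf{s} - \mathbf{w}^{H}\mathbf{s}\right|^{2}\right\}
                 + \mu\, E\!\left\{\left|\mathbf{w}^{H}\mathbf{n}\right|^{2}\right\} \\
  % Setting the gradient with respect to w to zero:
  \mathbf{0} &= \left(\mathbf{R}_s + \mu\,\mathbf{R}_n\right)\mathbf{w} - \mathbf{R}_s\,\mathbf{e}_1 \\
  \mathbf{w}_{\mu} &= \left(\mathbf{R}_s + \mu\,\mathbf{R}_n\right)^{-1}\mathbf{R}_s\,\mathbf{e}_1
\end{align}
```

With $\mu = 1$ this reduces to the ordinary MWF, since $\mathbf{R}_s + \mathbf{R}_n$ is then the correlation matrix of the noisy input. Larger values of $\mu$ weight residual noise more heavily, giving more noise reduction at the cost of more speech distortion.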
Pages: 746-750
Number of pages: 5
Related papers
50 in total
  • [1] A SUPERVISED MULTI-CHANNEL SPEECH ENHANCEMENT ALGORITHM BASED ON BAYESIAN NMF MODEL
    Chung, Hanwook
    Plourde, Eric
    Champagne, Benoit
    2018 IEEE GLOBAL CONFERENCE ON SIGNAL AND INFORMATION PROCESSING (GLOBALSIP 2018), 2018, : 221 - 225
  • [2] Nanowire Strain Sensor-based Word Recognition for Speech-impaired Users
    Zhang, Eric
    Wu, Shuang
    Shen, Alex
    Syed, Zahid
    Zhu, Yong
    Shen, Xipeng
    2024 IEEE 24TH INTERNATIONAL CONFERENCE ON NANOTECHNOLOGY, NANO 2024, 2024, : 539 - 544
  • [3] Eigenvector-Based Speech Mask Estimation for Multi-Channel Speech Enhancement
    Pfeifenberger, Lukas
    Zoehrer, Matthias
    Pernkopf, Franz
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2019, 27 (12) : 2162 - 2172
  • [4] Semantic Enhancement Framework for Robust Speech Recognition
    Yang, Baochen
    Yu, Kai
    MAN-MACHINE SPEECH COMMUNICATION, NCMMSC 2022, 2023, 1765 : 81 - 88
  • [5] Multi-Channel Transformer Transducer for Speech Recognition
    Chang, Feng-Ju
    Radfar, Martin
    Mouchtaris, Athanasios
    Omologo, Maurizio
    INTERSPEECH 2021, 2021, : 296 - 300
  • [6] MULTI-CHANNEL OVERLAPPED SPEECH RECOGNITION WITH LOCATION GUIDED SPEECH EXTRACTION NETWORK
    Chen, Zhuo
    Xiao, Xiong
    Yoshioka, Takuya
    Erdogan, Hakan
    Li, Jinyu
    Gong, Yifan
    2018 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2018), 2018, : 558 - 565
  • [7] Deep Neural Network-Based Generalized Sidelobe Canceller for Robust Multi-channel Speech Recognition
    Li, Guanjun
    Liang, Shan
    Nie, Shuai
    Liu, Wenju
    Yang, Zhanlei
    Xiao, Longshuai
    INTERSPEECH 2020, 2020, : 51 - 55
  • [8] Dual channel based speech enhancement using novelty filter for robust speech recognition in automobile environment
    Beh, Jounghoon
    Baran, Robert H.
    Ko, Hanseok
    IEEE TRANSACTIONS ON CONSUMER ELECTRONICS, 2006, 52 (02) : 583 - 589
  • [9] Environmental robust speech and speaker recognition through multi-channel histogram equalization
    Squartini, Stefano
    Principi, Emanuele
    Rotili, Rudy
    Piazza, Francesco
    NEUROCOMPUTING, 2012, 78 (01) : 111 - 120
  • [10] A Feature Integration Network for Multi-Channel Speech Enhancement
    Zeng, Xiao
    Zhang, Xue
    Wang, Mingjiang
    SENSORS, 2024, 24 (22)