Convolutional Neural Network-based Speech Enhancement for Cochlear Implant Recipients

Citations: 0
Authors
Mamun, Nursadul [1]
Khorram, Soheil [1]
Hansen, John H. L. [1]
Affiliations
[1] Univ Texas Dallas, Cochlear Implant Proc Lab, Ctr Robust Speech Syst CRSS CILab, Dept Elect & Comp Engn, Richardson, TX 75083 USA
Source
INTERSPEECH 2019 | 2019
Keywords
Speech enhancement; convolutional neural network; cochlear implants; hearing aids; CCi-MOBILE; NOISE;
DOI
10.21437/Interspeech.2019-1850
Chinese Library Classification (CLC) number
R36 [Pathology]; R76 [Otorhinolaryngology];
Discipline classification code
100104; 100213;
Abstract
Attempts to develop speech enhancement algorithms that improve speech intelligibility for cochlear implant (CI) users have met with limited success. To improve speech enhancement for CI users, we propose performing enhancement in a cochlear filter-bank feature space, a feature set designed specifically for CI users and based on CI auditory stimuli. We leverage a convolutional neural network (CNN) to extract both stationary and non-stationary components of environmental acoustics and speech. We propose three CNN architectures: (1) a vanilla CNN that directly generates the enhanced signal; (2) a spectral-subtraction-style CNN (SS-CNN) that first predicts the noise and then generates the enhanced signal by subtracting that noise from the noisy signal; and (3) a Wiener-style CNN (Wiener-CNN) that generates an optimal mask for suppressing noise. An important limitation of the proposed networks is that they introduce considerable delay, which limits their real-time use for CI users. To address this, the study also considers causal variants of these networks. Our experiments show that the proposed networks, in both causal and non-causal forms, achieve significant improvement over existing baseline systems. We also find that the causal Wiener-CNN outperforms the other networks and yields the best overall envelope coefficient measure (ECM). The proposed algorithms represent a viable option for implementation on the CCi-MOBILE research platform as a pre-processor for CI users in naturalistic environments.
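The three output styles and the causal/non-causal distinction named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the CNN itself is replaced by a placeholder array `net_out`, and the function names, the zero floor in the subtraction step, and the sigmoid used to bound the mask are all assumptions made for illustration. Features are assumed to be non-negative filter-bank envelopes of shape (channels, frames).

```python
import numpy as np

def conv_time(x, kernel, causal):
    """Slide a 1-D kernel along the time axis of x, shape (channels, frames).

    causal=True pads only on the left, so output frame t uses frames <= t.
    causal=False pads on both sides, so frame t also uses future frames.
    """
    k = len(kernel)
    left = k - 1 if causal else (k - 1) // 2
    right = (k - 1) - left
    padded = np.pad(x, ((0, 0), (left, right)))
    out = np.zeros_like(x, dtype=float)
    for i, w in enumerate(kernel):
        out += w * padded[:, i:i + x.shape[1]]
    return out

def vanilla_output(noisy, net_out):
    # Vanilla style: the network output is taken directly as the enhanced features.
    return net_out

def ss_output(noisy, net_out):
    # Spectral-subtraction style: net_out is read as a noise estimate.
    # Flooring at zero is an assumption, not stated in the abstract.
    return np.maximum(noisy - net_out, 0.0)

def wiener_output(noisy, net_out):
    # Wiener style: net_out is squashed into a (0, 1) mask and applied to the input.
    # The sigmoid is an assumed choice for bounding the mask.
    mask = 1.0 / (1.0 + np.exp(-net_out))
    return mask * noisy

# Toy data standing in for filter-bank features (4 channels, 10 frames).
rng = np.random.default_rng(0)
noisy = rng.random((4, 10))
kernel = np.array([0.2, 0.3, 0.5])

causal_feat = conv_time(noisy, kernel, causal=True)
noncausal_feat = conv_time(noisy, kernel, causal=False)
enhanced = wiener_output(noisy, causal_feat)
```

In the causal form, each output frame can be produced as soon as its input frame arrives, which is what makes the approach compatible with low-delay processing; the non-causal form must wait for future frames before emitting a result.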
Pages: 4265-4269
Page count: 5