Optical Microphone-Based Speech Reconstruction System With Deep Learning for Individuals With Hearing Loss

Citations: 0
Authors
Lin, Yu-Min [1]
Han, Ji-Yan [1]
Lin, Cheng-Hung [2]
Lai, Ying-Hui [3, 4]
Affiliations
[1] Natl Yang Ming Chiao Tung Univ, Dept Biomed Engn, Taipei, Taiwan
[2] Natl Taiwan Normal Univ, Dept Elect Engn, Taipei, Taiwan
[3] Natl Yang Ming Chiao Tung Univ, Dept Biomed Engn, Taipei 112304, Taiwan
[4] Natl Yang Ming Chiao Tung Univ, Med Device Innovat & Translat Ctr, Taipei 112304, Taiwan
Keywords
Deep learning; Lasers; Doppler effect; laser Doppler vibrometer; speech enhancement; MEAN-SQUARE ERROR; NEURAL-NETWORKS; ENHANCEMENT; NOISE; INTELLIGIBILITY; AMPLIFICATION; OSCILLATIONS; RECOGNITION; PERCEPTION; FEATURES
DOI
10.1109/TBME.2023.3285437
Chinese Library Classification number
R318 [Biomedical Engineering]
Discipline classification code
0831
Abstract
Objective: Although many speech enhancement (SE) algorithms have been proposed to improve speech perception in hearing-impaired patients, conventional SE approaches that perform well in quiet conditions and/or under stationary noise fail under nonstationary noise and/or when the speaker is at a considerable distance. The objective of this study is therefore to overcome these limitations of conventional SE approaches. Method: This study proposes a speaker-closed, deep learning-based SE method combined with an optical microphone to acquire and enhance the speech of a target speaker. Results: In objective evaluation, the proposed method outperformed the baseline methods by margins of 0.21-0.27 in speech quality (HASQI) and 0.34-0.64 in speech comprehension/intelligibility (HASPI) for seven typical hearing loss types. Conclusion: The results suggest that the proposed method can enhance speech perception by removing noise from speech signals and mitigating interference caused by distance. Significance: The results point to a potential way to improve the listening experience of hearing-impaired people by enhancing speech quality and speech comprehension/intelligibility.
Pages: 3330-3341
Page count: 12