ATTENTION-BASED FUSION FOR BONE-CONDUCTED AND AIR-CONDUCTED SPEECH ENHANCEMENT IN THE COMPLEX DOMAIN

Cited by: 5
Authors
Wang, Heming [1]
Zhang, Xueliang [2]
Wang, DeLiang [1, 3]
Affiliations
[1] Ohio State Univ, Dept Comp Sci & Engn, Columbus, OH 43210 USA
[2] Inner Mongolia Univ, Dept Comp Sci, Hohhot, Peoples R China
[3] Ohio State Univ, Ctr Cognit & Brain Sci, Columbus, OH 43210 USA
Keywords
bone conduction; speech enhancement; complex spectral mapping; attention-based fusion
DOI
10.1109/ICASSP43922.2022.9746374
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Bone-conduction (BC) microphones capture speech signals by converting the vibrations of the human skull into electrical signals. BC sensors are insensitive to acoustic noise, but limited in bandwidth. On the other hand, conventional or air-conduction (AC) microphones are capable of capturing full-band speech, but are susceptible to background noise. We propose to combine the strengths of AC and BC microphones by employing a convolutional recurrent network that performs complex spectral mapping. To better utilize signals from both kinds of microphone, we employ attention-based fusion with early-fusion and late-fusion strategies. Experiments demonstrate the superiority of the proposed method over other recent speech enhancement methods combining BC and AC signals. In addition, our enhancement performance is significantly better than conventional speech enhancement counterparts, especially in low signal-to-noise ratio scenarios.
Pages: 7757-7761
Page count: 5
Related papers
50 in total
  • [21] Speaker-Independent Spectral Enhancement for Bone-Conducted Speech
    Cheng, Liangliang
    Dou, Yunfeng
    Zhou, Jian
    Wang, Huabin
    Tao, Liang
    ALGORITHMS, 2023, 16 (03)
  • [22] A COMPARISON OF AUDITORY BRAIN-STEM RESPONSE THRESHOLDS AND LATENCIES ELICITED BY AIR-CONDUCTED AND BONE-CONDUCTED STIMULI
    GORGA, MP
    KAMINSKI, JR
    BEAUCHAINE, KL
    BERGMAN, BM
    EAR AND HEARING, 1993, 14 (02) : 85 - 94
  • [23] Bone-conducted speech enhancement using deep denoising autoencoder
    Liu, Hung-Ping
    Tsao, Yu
    Fuh, Chiou-Shann
    SPEECH COMMUNICATION, 2018, 104 : 106 - 112
  • [24] Quality improvement of bone-conducted speech
    Shimamura, T
    Tomikura, T
    PROCEEDINGS OF THE 2005 EUROPEAN CONFERENCE ON CIRCUIT THEORY AND DESIGN, VOL 3, 2005, : 73 - 76
  • [25] Intelligibility of bone-conducted ultrasonic speech
    Okamoto, Y
    Nakagawa, S
    Fujimoto, K
    Tonoike, M
    HEARING RESEARCH, 2005, 208 (1-2) : 107 - 113
  • [26] AUDITORY BRAIN-STEM RESPONSES TO AIR-CONDUCTED AND BONE-CONDUCTED CLICKS IN THE AUDIOLOGICAL ASSESSMENT OF AT-RISK INFANTS
    YANG, EY
    STUART, A
    MENCHER, GT
    MENCHER, LS
    VINCER, MJ
    EAR AND HEARING, 1993, 14 (03) : 175 - 182
  • [27] COMPARISON BETWEEN MIDDLE-EAR MUSCLE REFLEX THRESHOLDS FOR BONE-CONDUCTED AND AIR-CONDUCTED PURE-TONES
    DJUPESLAND, G
    FLOTTORP, G
    SUNDBY, A
    SZALAY, M
    ACTA OTO-LARYNGOLOGICA, 1973, 75 (2-3) : 178 - 183
  • [28] Comparison of cervical vestibular evoked potentials evoked by air-conducted sound and bone-conducted vibration in vestibular Schwannoma patients
    Ogawa, Yasuo
    Otsuka, Koji
    Inagaki, Taro
    Nagai, Noriko
    Itani, Shigeto
    Kondo, Takahito
    Kohno, Michihiro
    Suzuki, Mamoru
    ACTA OTO-LARYNGOLOGICA, 2018, 138 (10) : 898 - 903
  • [29] Spectra Restoration of Bone-Conducted Speech via Attention-Based Contextual Information and Spectro-Temporal Structure Constraint
    Zheng, Changyan
    Cao, Tieyong
    Yang, Jibin
    Zhang, Xiongwei
    Sun, Meng
    IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 2019, E102A (12) : 2001 - 2007
  • [30] Bone-conducted speech enhancement using WaveNet fused with phase information
    Zheng, Changyan
    Yang, Jibin
    Zhang, Xiongwei
    Sun, Meng
    Shengxue Xuebao/Acta Acustica, 2021, 46 (02) : 309 - 320