ATTENTION-BASED FUSION FOR BONE-CONDUCTED AND AIR-CONDUCTED SPEECH ENHANCEMENT IN THE COMPLEX DOMAIN

Cited by: 5
Authors
Wang, Heming [1]
Zhang, Xueliang [2]
Wang, DeLiang [1, 3]
Affiliations
[1] Ohio State Univ, Dept Comp Sci & Engn, Columbus, OH 43210 USA
[2] Inner Mongolia Univ, Dept Comp Sci, Hohhot, Peoples R China
[3] Ohio State Univ, Ctr Cognit & Brain Sci, Columbus, OH 43210 USA
Keywords
bone conduction; speech enhancement; complex spectral mapping; attention-based fusion
DOI
10.1109/ICASSP43922.2022.9746374
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Bone-conduction (BC) microphones capture speech signals by converting the vibrations of the human skull into electrical signals. BC sensors are insensitive to acoustic noise, but limited in bandwidth. On the other hand, conventional or air-conduction (AC) microphones are capable of capturing full-band speech, but are susceptible to background noise. We propose to combine the strengths of AC and BC microphones by employing a convolutional recurrent network that performs complex spectral mapping. To better utilize signals from both kinds of microphone, we employ attention-based fusion with early-fusion and late-fusion strategies. Experiments demonstrate the superiority of the proposed method over other recent speech enhancement methods combining BC and AC signals. In addition, our enhancement performance is significantly better than conventional speech enhancement counterparts, especially in low signal-to-noise ratio scenarios.
Pages: 7757-7761
Page count: 5
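
The abstract describes fusing AC and BC signals with attention in the complex spectral domain, but gives no architectural details. The following is only a minimal NumPy sketch of the general idea, not the authors' network: each signal is turned into stacked real and imaginary STFT features, and a softmax over the two streams produces per-time-frequency weights that blend them. The function names `stft_ri` and `attention_fuse`, the frame sizes, and the linear scoring vectors `w_ac` and `w_bc` are all illustrative assumptions; in a trained model the scores would come from learned layers inside the convolutional recurrent network.

```python
import numpy as np

def stft_ri(signal, frame_len=64, hop=32):
    """Return stacked real/imaginary STFT features, shape (2, frames, bins)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    spec = np.fft.rfft(frames, axis=-1)
    return np.stack([spec.real, spec.imag])

def attention_fuse(ac_feat, bc_feat, w_ac, w_bc):
    """Blend two feature maps with softmax weights per time-frequency unit.

    w_ac and w_bc are hypothetical scoring vectors of shape (2,), standing in
    for learned parameters.
    """
    # One scalar score per stream per time-frequency unit.
    score_ac = np.tensordot(w_ac, ac_feat, axes=([0], [0]))  # (frames, bins)
    score_bc = np.tensordot(w_bc, bc_feat, axes=([0], [0]))
    scores = np.stack([score_ac, score_bc])                  # (2, frames, bins)
    # Numerically stable softmax across the two streams.
    scores = scores - scores.max(axis=0, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=0, keepdims=True)
    # Weights broadcast over the real/imaginary axis.
    fused = weights[0] * ac_feat + weights[1] * bc_feat
    return fused, weights

# Random stand-ins for one AC and one BC recording (assumed data).
rng = np.random.default_rng(0)
ac_feat = stft_ri(rng.standard_normal(512))
bc_feat = stft_ri(rng.standard_normal(512))
fused, weights = attention_fuse(ac_feat, bc_feat,
                                rng.standard_normal(2), rng.standard_normal(2))
```

Applying this blend to the input features before any shared processing would correspond loosely to an early-fusion strategy, while applying it to the outputs of two separate branches would correspond to late fusion; how the paper actually places its fusion modules is not stated in this record.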