A lightweight speech enhancement network fusing bone- and air-conducted speech

Authors
Kuang, Kelan [1, 2]
Yang, Feiran [3]
Yang, Jun [1, 2]
Affiliations
[1] Chinese Acad Sci, Inst Acoust, Key Lab Noise & Vibrat Res, Beijing 100190, Peoples R China
[2] Univ Chinese Acad Sci, Beijing 100049, Peoples R China
[3] Chinese Acad Sci, Inst Acoust, State Key Lab Acoust, Beijing 100190, Peoples R China
Source
Funding
Beijing Natural Science Foundation; National Natural Science Foundation of China;
Keywords
NOISE; INTELLIGIBILITY; RESTORATION; ALGORITHM;
DOI
10.1121/10.0028339
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
Air-conducted (AC) microphones capture high-quality desired speech together with ambient noise, whereas bone-conducted (BC) microphones are immune to ambient noise but capture only band-limited speech. This paper proposes a speech enhancement model that leverages the merits of both BC and AC speech. The proposed model takes the spectrograms of BC and AC speech as input and fuses them with an attention-based feature fusion module. The backbone network of the proposed model uses the fused signals to estimate a mask of the target speech, which is then applied to the noisy AC speech to recover the target speech. The proposed model adopts a lightweight design of the densely gated convolutional attention network (DenGCAN) as the backbone network, which contains an encoder, bottleneck layers, and a decoder. Furthermore, this paper improves an attention gate and integrates it into the skip connections of DenGCAN, which allows the decoder to focus on the key areas of the feature map extracted by the encoder. Because DenGCAN adopts a self-attention mechanism, the proposed model has the potential to improve noise reduction performance at the expense of increased input-output latency. Experimental results demonstrate that the enhanced speech of the proposed model achieves an average wideband-PESQ improvement of 1.870 over the noisy AC speech.
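The fuse-then-mask pipeline described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the sigmoid gate below is only a simplified stand-in for the attention-based fusion module, the parameters `w_ac`, `w_bc`, and `bias` are hypothetical placeholders for learned weights, and the mask is supplied by hand rather than estimated by a DenGCAN-style backbone. Spectrograms are represented as nested lists indexed by frame and frequency bin.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def fuse(ac_mag, bc_mag, w_ac=1.0, w_bc=1.0, bias=0.0):
    """Blend AC and BC magnitudes bin by bin with a sigmoid gate.

    Assumption: w_ac, w_bc, and bias stand in for learned parameters;
    a real attention-based fusion module would compute its weights
    from richer features than a single bin.
    """
    fused = []
    for ac_row, bc_row in zip(ac_mag, bc_mag):
        row = []
        for a, b in zip(ac_row, bc_row):
            gate = sigmoid(w_ac * a + w_bc * b + bias)
            # Convex combination: the result lies between a and b.
            row.append(gate * a + (1.0 - gate) * b)
        fused.append(row)
    return fused


def apply_mask(mask, noisy_ac_spec):
    """Multiply the noisy AC spectrogram element-wise by the mask."""
    return [
        [m * x for m, x in zip(mask_row, spec_row)]
        for mask_row, spec_row in zip(mask, noisy_ac_spec)
    ]


# Toy 2x2 complex spectrogram of noisy AC speech (frames x frequency bins).
noisy_ac = [[1 + 1j, 2 + 0j], [0 + 2j, 4 + 0j]]
ac_mag = [[abs(x) for x in row] for row in noisy_ac]
# Toy BC magnitudes: energy only in the low-frequency bin (band-limited).
bc_mag = [[1.0, 0.0], [2.0, 0.0]]

fused = fuse(ac_mag, bc_mag)

# Hand-written mask standing in for the backbone network's estimate.
mask = [[1.0, 0.5], [0.0, 0.25]]
enhanced = apply_mask(mask, noisy_ac)
```

In the toy data the BC input has energy only in the low-frequency bin, mirroring its band-limited nature, and the mask is applied to the noisy AC spectrogram rather than to the fused features, as the abstract describes.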
Pages: 1355-1366
Number of pages: 12