Multichannel speech enhancement algorithm based on hybrid reverberation model

Cited by: 0
Authors
Xie, Yuan [1]
Zou, Tao [1]
Sun, Weijun [2]
Xie, Shengli [2]
Affiliations
[1] School of Mechanical and Electrical Engineering, Guangzhou University, Guangzhou
[2] Key Laboratory of Intelligent Information Processing and System Integration of Internet of Things, Ministry of Education, Guangdong University of Technology, Guangzhou
Source
Tongxin Xuebao/Journal on Communications | 2024, Vol. 45, No. 11
Funding
National Natural Science Foundation of China
Keywords
Kalman filter; multichannel speech enhancement; polynomial matrix eigenvalue decomposition
DOI
10.11959/j.issn.1000-436x.2024197
Abstract
To address speech enhancement in reverberant and noisy conditions, a new speech enhancement model was constructed by integrating a multichannel linear prediction model with a spatial coherence model, and a multichannel speech enhancement algorithm based on this hybrid reverberation model was designed. The late reverberation was divided into two components, modeled by the multichannel linear prediction model and the spatial coherence model, respectively. A Kalman filter was used to update the model parameters, and polynomial matrix eigenvalue decomposition was used for spatial, temporal, and spectral decorrelation to reduce reverberation and noise. Experimental results show that the proposed algorithm enhances speech in both high- and low-reverberation noisy environments and outperforms popular speech enhancement algorithms: the perceptual evaluation of speech quality (PESQ) score and the short-time objective intelligibility (STOI) score increased by 30% and 20%, respectively. © 2024 Editorial Board of Journal on Communications. All rights reserved.
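As a rough illustration of the Kalman-filter parameter update mentioned in the abstract, the sketch below tracks linear prediction coefficients under a random-walk state model. It is not the authors' algorithm: the hybrid reverberation model, the spatial coherence component, and the polynomial matrix eigenvalue decomposition are not reproduced, and the signals are real-valued scalars rather than complex multichannel STFT coefficients. The names `kalman_lp_step` and `demo_track`, and the noise settings `q` and `r`, are hypothetical choices made for this example.

```python
import numpy as np


def kalman_lp_step(g, P, x_past, y, q=1e-6, r=1e-2):
    """One Kalman update of prediction coefficients g (random-walk state model).

    g      : (K,) current coefficient estimate
    P      : (K, K) estimate error covariance
    x_past : (K,) delayed past observations used as the regressor
    y      : scalar current observation
    q, r   : assumed process and observation noise variances (illustrative)
    """
    P = P + q * np.eye(len(g))        # predict: coefficients drift as a random walk
    e = y - x_past @ g                # prediction error (residual after prediction)
    s = x_past @ P @ x_past + r       # innovation variance
    k = P @ x_past / s                # Kalman gain
    g = g + k * e                     # correct the coefficient estimate
    P = P - np.outer(k, x_past @ P)   # correct the covariance
    return g, P, e


def demo_track(n_steps=2000, seed=0):
    """Recover fixed synthetic coefficients from noisy observations."""
    rng = np.random.default_rng(seed)
    g_true = np.array([0.5, -0.3, 0.2, 0.1])   # synthetic ground truth
    g, P = np.zeros(4), np.eye(4)
    for _ in range(n_steps):
        x_past = rng.standard_normal(4)
        y = x_past @ g_true + 0.1 * rng.standard_normal()
        g, P, _ = kalman_lp_step(g, P, x_past, y)
    return g, g_true
```

In a linear-prediction dereverberation setting, the residual `e` would play the role of the signal with the predictable reverberant part removed; here it only serves to drive the coefficient update on synthetic data.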
Pages: 15-26
Number of pages: 11