Speech enhancement method based on multi-domain fusion and neural architecture search

Cited by: 0

Authors:

Zhang R. [1]

Zhang P. [1]

Sun C. [1]

Affiliation:

[1] College of Computer Science and Technology, Taiyuan University of Science and Technology, Taiyuan

Source:

Tongxin Xuebao/Journal on Communications | 2024 / Vol. 45 / No. 02

Funding:

National Natural Science Foundation of China

Keywords:

complex neural architecture search; complex spatial domain mapping; low-cost evaluation; multi-domain fusion; speech enhancement model

DOI:

10.11959/j.issn.1000-436x.2024018

Chinese Library Classification number:

Subject classification number:

Abstract:

To further improve the self-learning and noise reduction ability of speech enhancement models, a speech enhancement method based on multi-domain fusion and neural architecture search is proposed. A multi-spatial-domain mapping and fusion mechanism for speech signals is designed to mine correlations between real and complex numbers. Based on the convolution and pooling characteristics of the model, a complex neural architecture search mechanism is proposed, and the speech enhancement model is constructed efficiently and automatically through the designed search space, search strategy and evaluation strategy. In comparison and generalization experiments between the optimal speech enhancement model and the baseline models, the PESQ and STOI indexes increase by 5.6% compared with the best baseline model, and the number of model parameters is the lowest. © 2024 Editorial Board of Journal on Communications. All rights reserved.

Citation

Pages: 225-239

Page count: 14
