Speech Enhancement via Mask-Mapping Based Residual Dense Network

Cited by: 1

Authors:

Zhou, Lin [1]

Chen, Xijin [1]

Wu, Chaoyan [1]

Zhong, Qiuyue [1]

Cheng, Xu [2]

Tang, Yibin [3]

Affiliations:

[1] Southeast Univ, Sch Informat Sci & Engn, Nanjing 210096, Peoples R China

[2] Univ Oulu, Ctr Machine Vis & Signal Anal, FI-90014 Oulu, Finland

[3] Hohai Univ, Coll IOT Engn, Changzhou 213022, Peoples R China

Source:

CMC-COMPUTERS MATERIALS & CONTINUA | 2023 / Vol. 74 / Issue 01

Funding:

National Natural Science Foundation of China;

Keywords:

Mask-mapping-based method; residual dense block; speech enhancement; ALGORITHM; NOISE;

DOI:

10.32604/cmc.2023.027379

Chinese Library Classification number:

TP [Automation technology, computer technology];

Subject classification code:

0812;

Abstract:

Masking-based and spectrum mapping-based methods are the two main approaches to speech enhancement with deep neural networks (DNNs). However, mapping-based methods only utilize the phase of the noisy speech, which limits the upper bound of speech enhancement performance. Masking-based methods need to estimate the mask accurately, which remains the key problem. Combining the advantages of these two types of methods, this paper proposes the speech enhancement algorithm MM-RDN (masking-mapping residual dense network), based on masking-mapping (MM) and a residual dense network (RDN). Using the logarithmic power spectrogram (LPS) of consecutive frames, MM estimates the ideal ratio masking (IRM) matrix of consecutive frames. RDN can make full use of the feature maps of all layers. Meanwhile, using global residual learning to combine shallow and deep features, RDN obtains global dense features from the LPS, thereby improving the estimation accuracy of the IRM matrix. Simulations show that the proposed method achieves attractive speech enhancement performance in various acoustic environments. Specifically, in untrained acoustic tests with limited priors, e.g., unmatched signal-to-noise ratio (SNR) and unmatched noise category, MM-RDN still outperforms the existing convolutional recurrent network (CRN) method in terms of perceptual evaluation of speech quality (PESQ) and other evaluation metrics. This indicates that the proposed algorithm generalizes better to untrained conditions.
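
The two quantities named in the abstract, the LPS input feature and the IRM training target, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes the commonly used IRM definition (clean power over clean-plus-noise power, raised to an exponent of 0.5), and the function names, the `beta` and `eps` parameters, and the random stand-in spectrograms are all hypothetical. The network that predicts the mask is not shown; the sketch only demonstrates how an oracle mask would be applied while reusing the noisy phase.

```python
import numpy as np

def log_power_spectrogram(stft, eps=1e-10):
    """LPS of a complex STFT matrix (frequency bins x frames)."""
    return np.log(np.abs(stft) ** 2 + eps)

def ideal_ratio_mask(clean_stft, noise_stft, beta=0.5, eps=1e-10):
    """IRM under the common definition (assumed here, not taken from the paper)."""
    clean_pow = np.abs(clean_stft) ** 2
    noise_pow = np.abs(noise_stft) ** 2
    return (clean_pow / (clean_pow + noise_pow + eps)) ** beta

def apply_mask(noisy_stft, mask):
    """Scale the noisy magnitude by the mask and keep the noisy phase."""
    return mask * np.abs(noisy_stft) * np.exp(1j * np.angle(noisy_stft))

# Random complex matrices stand in for real STFTs (129 bins, 7 consecutive frames).
rng = np.random.default_rng(0)
shape = (129, 7)
clean = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
noise = 0.5 * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape))
noisy = clean + noise

lps = log_power_spectrogram(noisy)      # network input feature
irm = ideal_ratio_mask(clean, noise)    # training target, values in [0, 1]
enhanced = apply_mask(noisy, irm)       # enhanced spectrum with noisy phase
```

Because every mask value lies in [0, 1], the enhanced magnitude never exceeds the noisy magnitude in any time-frequency bin.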


Pages: 1259 - 1277

Number of pages: 19

50 records in total

[1] Speech Enhancement via Residual Dense Generative Adversarial Network
Zhou, Lin
Zhong, Qiuyue
Wang, Tianyi
Lu, Siyuan
Hu, Hongmei
COMPUTER SYSTEMS SCIENCE AND ENGINEERING, 2021, 38 (03) : 279 - 289
[2] Deep Residual-Dense Lattice Network for Speech Enhancement
Nikzad, Mohammad
Nicolson, Aaron
Gao, Yongsheng
Zhou, Jun
Paliwal, Kuldip K.
Shang, Fanhua
THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 8552 - 8559
[3] SHO based Deep Residual network and hierarchical speech features for speech enhancement
Bhosle M.R.
Narayaswamy N.K.
International Journal of Speech Technology, 2023, 26 (02) : 355 - 370
[4] Improving mask learning based speech enhancement system with restoration layers and residual connection
Chen, Zhuo
Huang, Yan
Li, Jinyu
Gong, Yifan
18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 3632 - 3636
[5] Deep neural network based speech enhancement using mono channel mask
Pallavi P. Ingale
Sanjay L. Nalbalwar
International Journal of Speech Technology, 2019, 22 : 841 - 850
[6] Deep neural network based speech enhancement using mono channel mask
Ingale, Pallavi P.
Nalbalwar, Sanjay L.
INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2019, 22 (03) : 841 - 850
[7] RESIDUAL RECURRENT NEURAL NETWORK FOR SPEECH ENHANCEMENT
Abdulbaqi, Jalal
Gu, Yue
Chen, Shuhong
Marsic, Ivan
2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 6659 - 6663
[8] Speech Enhancement Method Based on Frequency-Time Dilated Dense Network
Huang X.
Chen H.
Gan L.
Jisuanji Yanjiu yu Fazhan/Computer Research and Development, 2023, 60 (07) : 1628 - 1638
[9] Deep Residual Network-Based Augmented Kalman Filter for Speech Enhancement
Roy, Sujan Kumar
Paliwal, Kuldip K.
2020 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2020, : 667 - 673
[10] Speech enhancement via sparse coding with ideal binary mask
Sun, Juan
Tang, Yibin
Jiang, Aimin
Xu, Ning
Zhou, Lin
2014 12TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING (ICSP), 2014, : 537 - 540
