Deep Noise Tracking Network: A Hybrid Signal Processing/Deep Learning Approach to Speech Enhancement

Times cited: 0

Authors:

Nie, Shuai ^{[1,3]}

Liang, Shan ^{[1]}

Liu, Bin ^{[1,3]}

Zhang, Yaping ^{[1,3]}

Liu, Wenju ^{[1]}

Tao, Jianhua ^{[1,2,3]}

Affiliations:

[1] Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing, Peoples R China

[2] CAS Ctr Excellence Brain Sci & Intelligence Techn, Beijing, Peoples R China

[3] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing, Peoples R China

Source:

19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES | 2018

Funding:

National Key Research and Development Program of China;

Keywords:

speech enhancement; noise tracking; deep learning; signal processing; RECOGNITION;

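DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract: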

Noise statistics and speech spectrum characteristics are essential information for single-channel speech enhancement. Signal processing-based methods rely mainly on noise statistics estimation. They perform very well for stationary noise but still struggle with non-stationary noise. Deep learning-based methods, by contrast, focus mainly on perceiving the spectral characteristics of speech and are able to deal with non-stationary noise. However, their performance degrades dramatically for unseen noise types, which could be due to over-reliance on data and neglect of signal processing domain knowledge. A hybrid signal processing/deep learning scheme may therefore be a smart alternative. In this paper, we incorporate the powerful perceptual capabilities of deep learning into the conventional speech enhancement framework. Deep learning is used to estimate the speech presence probability and the update factor of the noise statistics, which are then integrated into a Wiener filter-based speech enhancement structure to enhance the desired speech. All components are jointly optimized with a spectrum approximation objective. Systematic experiments on CHiME-4 and NOISEX-92 demonstrate the proposed hybrid signal processing/deep learning approach to noise suppression in both noise-unmatched and noise-matched conditions.
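To make the structure described in the abstract concrete, the following is a minimal NumPy sketch of how a speech presence probability and a noise update factor could drive a recursive noise estimate feeding a Wiener gain. It is not the authors' implementation: the function name `wiener_gains`, the smoothing rule (a classical speech-presence-controlled recursive average), the maximum-likelihood SNR estimate, and the `gain_floor` parameter are all assumptions chosen for illustration. In the paper these two quantities come from a jointly trained network; here they are simply passed in as arrays.

```python
import numpy as np


def wiener_gains(noisy_power, spp, alpha, noise_init, gain_floor=0.05):
    """Illustrative sketch only (hypothetical helper, not the paper's code).

    noisy_power: (T, F) noisy power spectrogram |Y|^2
    spp:         (T, F) speech presence probability in [0, 1]
    alpha:       (T, F) noise update (smoothing) factor in [0, 1]
    noise_init:  (F,)   initial noise power estimate
    Returns the (T, F) Wiener gains and the final (F,) noise estimate.
    """
    num_frames, num_bins = noisy_power.shape
    noise_psd = np.asarray(noise_init, dtype=float).copy()
    gains = np.empty((num_frames, num_bins))

    for t in range(num_frames):
        # Assumed rule: the more likely speech is present, the closer the
        # effective smoothing factor is to 1, which freezes the noise estimate.
        smooth = alpha[t] + (1.0 - alpha[t]) * spp[t]
        noise_psd = smooth * noise_psd + (1.0 - smooth) * noisy_power[t]

        # Maximum-likelihood a priori SNR estimate, clipped at zero.
        snr = np.maximum(noisy_power[t] / np.maximum(noise_psd, 1e-12) - 1.0, 0.0)

        # Wiener gain with a floor to limit over-suppression.
        gains[t] = np.maximum(snr / (1.0 + snr), gain_floor)

    return gains, noise_psd
```

The enhanced magnitude would then be obtained by multiplying the gains with the noisy magnitude spectrogram and resynthesizing with the noisy phase, which is a common choice rather than something the abstract specifies.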

Pages: 3219-3223

Number of pages: 5

