Deep Noise Tracking Network: A Hybrid Signal Processing/Deep Learning Approach to Speech Enhancement

Citations: 0
Authors
Nie, Shuai [1, 3]
Liang, Shan [1]
Liu, Bin [1, 3]
Zhang, Yaping [1, 3]
Liu, Wenju [1]
Tao, Jianhua [1, 2, 3]
Affiliations
[1] Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing, Peoples R China
[2] CAS Ctr Excellence Brain Sci & Intelligence Techn, Beijing, Peoples R China
[3] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing, Peoples R China
Source
19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES | 2018
Funding
National Key Research and Development Program of China;
Keywords
speech enhancement; noise tracking; deep learning; signal processing; RECOGNITION;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Noise statistics and speech spectrum characteristics are the essential information for single-channel speech enhancement. Signal processing-based methods rely mainly on noise statistics estimation. They perform very well for stationary noise but still struggle with non-stationary noise. Deep learning-based methods, by contrast, focus mainly on perceiving the spectral characteristics of speech and are capable of dealing with non-stationary noise. However, their performance degrades dramatically on unseen noise types, which could be due to over-reliance on data and disregard for signal processing domain knowledge. A hybrid signal processing/deep learning scheme may therefore be a smart alternative. In this paper, we incorporate the powerful perceptual capabilities of deep learning into the conventional speech enhancement framework. Deep learning is used to estimate the speech presence probability and the update factor of the noise statistics, which are then integrated into a Wiener filter-based speech enhancement structure to enhance the desired speech. All components are jointly optimized with a spectrum approximation objective. Systematic experiments on CHiME-4 and NOISEX-92 demonstrate the effectiveness of the proposed hybrid signal processing/deep learning approach to noise suppression in both noise-matched and noise-unmatched conditions.
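The abstract's pipeline (speech presence probability and update factor feeding a noise estimate, which in turn drives a Wiener filter) can be illustrated with a minimal sketch. This is not the authors' network or their exact update rule: the recursive form below is a common textbook-style noise tracker chosen as an assumption, the function names `update_noise_psd` and `wiener_gain` are hypothetical, and the `spp` and `alpha` values are passed in as plain numbers where the paper would obtain them from a deep model.

```python
from typing import List, Sequence


def update_noise_psd(prev_noise: Sequence[float],
                     noisy_power: Sequence[float],
                     spp: Sequence[float],
                     alpha: Sequence[float]) -> List[float]:
    """Update the per-bin noise power estimate for one frame.

    spp   : speech presence probability per bin, in [0, 1] (assumed given).
    alpha : update (smoothing) factor per bin, in [0, 1] (assumed given).
    """
    updated = []
    for n_prev, y_pow, p, a in zip(prev_noise, noisy_power, spp, alpha):
        # If speech is likely present (p near 1), keep the old noise value;
        # if it is likely absent (p near 0), trust the observed power.
        noise_obs = p * n_prev + (1.0 - p) * y_pow
        # Recursive smoothing over time, controlled by the update factor.
        updated.append(a * n_prev + (1.0 - a) * noise_obs)
    return updated


def wiener_gain(noisy_power: Sequence[float],
                noise_psd: Sequence[float],
                floor: float = 1e-3) -> List[float]:
    """Per-bin Wiener gain from noisy power and the estimated noise power."""
    gains = []
    for y_pow, n_pow in zip(noisy_power, noise_psd):
        # Power-subtraction estimate of the clean speech power.
        speech_pow = max(y_pow - n_pow, 0.0)
        gain = speech_pow / (speech_pow + n_pow + 1e-12)
        # A small gain floor avoids zeroing bins completely.
        gains.append(max(gain, floor))
    return gains


# One frame with two frequency bins: speech absent in bin 0, present in bin 1.
noise = update_noise_psd([1.0, 1.0], [4.0, 4.0], spp=[0.0, 1.0], alpha=[0.5, 0.5])
gains = wiener_gain([4.0, 4.0], noise)
```

In this toy frame the noise estimate rises only in the bin where speech is judged absent, so that bin receives a lower gain than the bin where speech is judged present. In the paper all components are trained jointly, which this hand-set example does not attempt to reproduce.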
Pages: 3219 - 3223
Page count: 5
Related papers
50 in total
  • [21] Speech Segregation in Background Noise Based on Deep Learning
    Awotunde, Joseph Bamidele
    Ogundokun, Roseline Oluwaseun
    Ayo, Femi Emmanuel
    Matiluko, Opeyemi Emmanuel
    IEEE ACCESS, 2020, 8 : 169568 - 169575
  • [22] An Experimental Study on Speech Enhancement Based on a Combination of Wavelets and Deep Learning
    Gutierrez-Munoz, Michelle
    Coto-Jimenez, Marvin
    COMPUTATION, 2022, 10 (06)
  • [23] Improving Speech Enhancement in Unseen Noise Using Deep Convolutional Neural Network
    Yuan W.-H.
    Sun W.-Z.
    Xia B.
    Ou S.-F.
    Zidonghua Xuebao/Acta Automatica Sinica, 2018, 44 (04) : 751 - 759
  • [24] Ensemble deep learning in speech signal tasks: A review
    Tanveer, M.
    Rastogi, Aryan
    Paliwal, Vardhan
    Ganaie, M. A.
    Malik, A. K.
    Del Ser, Javier
    Lin, Chin-Teng
    NEUROCOMPUTING, 2023, 550
  • [25] A Reduced Complexity MFCC-based Deep Neural Network Approach for Speech Enhancement
    Razani, Ryan
    Chung, Hanwook
    Attabi, Yazid
    Champagne, Benoit
    2017 IEEE INTERNATIONAL SYMPOSIUM ON SIGNAL PROCESSING AND INFORMATION TECHNOLOGY (ISSPIT), 2017 : 331 - 336
  • [26] A Perceptually Motivated Approach for Speech Enhancement Based on Deep Neural Network
    Han, Wei
    Zhang, Xiongwei
    Min, Gang
    Sun, Meng
    IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 2016, E99A (04) : 835 - 838
  • [27] Music Deep Learning: Deep Learning Methods for Music Signal Processing-A Review of the State-of-the-Art
    Moysis, Lazaros
    Iliadis, Lazaros Alexios
    Sotiroudis, Sotirios P.
    Boursianis, Achilles D.
    Papadopoulou, Maria S.
    Kokkinidis, Konstantinos-Iraklis D.
    Volos, Christos
    Sarigiannidis, Panagiotis
    Nikolaidis, Spiridon
    Goudos, Sotirios K.
    IEEE ACCESS, 2023, 11 : 17031 - 17052
  • [28] A Primer on Deep Learning Architectures and Applications in Speech Processing
    Ogunfunmi, Tokunbo
    Ramachandran, Ravi Prakash
    Togneri, Roberto
    Zhao, Yuanjun
    Xia, Xianjun
    CIRCUITS SYSTEMS AND SIGNAL PROCESSING, 2019, 38 (08) : 3406 - 3432
  • [29] A HYBRID APPROACH TO COMBINING CONVENTIONAL AND DEEP LEARNING TECHNIQUES FOR SINGLE-CHANNEL SPEECH ENHANCEMENT AND RECOGNITION
    Tu, Yan-Hui
    Tashev, Ivan
    Zarar, Shuayb
    Lee, Chin-Hui
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018 : 2531 - 2535