Unseen Noise Estimation Using Separable Deep Auto Encoder for Speech Enhancement

Cited by: 62
Authors
Sun, Meng [1 ]
Zhang, Xiongwei [1 ]
Van hamme, Hugo [2 ]
Zheng, Thomas Fang [3 ]
Affiliations
[1] PLA Univ Sci & Technol, Lab Intelligent Informat Proc, Nanjing 210007, Jiangsu, Peoples R China
[2] Katholieke Univ Leuven, Elect Engn Dept ESAT, Speech Proc Res Grp, B-3000 Louvain, Belgium
[3] Tsinghua Univ, Res Inst Informat Technol, Beijing 100084, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Deep auto encoder; source separation; speech enhancement; unseen noise compensation; HMM;
DOI
10.1109/TASLP.2015.2498101
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
Unseen noise estimation is a key yet challenging step in making a speech enhancement algorithm work in adverse environments. At worst, the only prior knowledge available about the encountered noise is that it differs from the speech involved. The noise can therefore be estimated and removed by subtracting the components that cannot be adequately represented by a well-defined speech model. Given the good performance of deep learning in signal representation, a deep auto encoder (DAE) is employed in this work to accurately model the clean speech spectrum. In the subsequent speech enhancement stage, an extra DAE is introduced to represent the residual obtained by subtracting the estimated clean speech spectrum (produced by the pre-trained DAE) from the noisy speech spectrum. By adjusting the estimated clean speech spectrum and the unknown parameters of the noise DAE, one can reach a stationary point in minimizing the total reconstruction error of the noisy speech spectrum. The enhanced speech signal is then obtained by transforming the estimated clean speech spectrum back into the time domain. The proposed technique is called the separable deep auto encoder (SDAE). Because this optimization problem is under-determined, the clean speech reconstruction is confined to the convex hull spanned by a pre-trained speech dictionary. New learning algorithms are investigated to respect the non-negativity of the parameters in the SDAE. Experimental results on TIMIT with 20 noise types at various noise levels demonstrate the superiority of the proposed method over the conventional baselines.
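The residual idea in the abstract can be illustrated with a deliberately simplified sketch. It is not the authors' SDAE: both deep auto encoders are replaced by a fixed non-negative dictionary `D` (random here, purely hypothetical), the speech estimate is confined to the convex hull of the columns of `D` by projected gradient descent, and the noise estimate is simply the non-negative part of what the speech model cannot explain. All function and variable names are assumptions made for illustration.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto {a : a >= 0, sum(a) = 1}."""
    u = np.sort(v)[::-1]                      # sort in descending order
    css = np.cumsum(u) - 1.0
    k = np.arange(1, v.size + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]  # last index that stays positive
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)

def estimate_speech_and_noise(y, D, n_iter=5000):
    """Toy stand-in for the separation step (assumption, not the paper's method).

    y : noisy magnitude spectrum, shape (F,)
    D : non-negative speech dictionary, shape (F, K)
    Returns convex weights, the speech estimate D @ a, and the
    non-negative residual used as the noise estimate.
    """
    K = D.shape[1]
    a = np.full(K, 1.0 / K)                   # start at the simplex centre
    step = 1.0 / np.linalg.norm(D, 2) ** 2    # 1 / Lipschitz constant of the gradient
    for _ in range(n_iter):
        grad = D.T @ (D @ a - y)              # gradient of 0.5 * ||D a - y||^2
        a = project_simplex(a - step * grad)  # keep weights in the convex hull
    speech = D @ a
    noise = np.maximum(y - speech, 0.0)       # part the speech model cannot represent
    return a, speech, noise

# Synthetic example: 16 frequency bins, 4 dictionary atoms.
rng = np.random.default_rng(0)
D = rng.uniform(0.1, 1.0, size=(16, 4))
a_true = np.array([0.6, 0.3, 0.1, 0.0])
clean = D @ a_true
noise_true = np.zeros(16)
noise_true[[3, 11]] = 0.8                     # noise concentrated in two bins
a_hat, speech_hat, noise_hat = estimate_speech_and_noise(clean + noise_true, D)
```

In this toy version the speech model is free to absorb part of the noise, since nothing models the residual. The abstract describes a second DAE for the residual and a joint minimization of the total reconstruction error, which this sketch does not attempt.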
Pages: 93-104
Number of pages: 12
Related papers
50 in total
  • [31] Speech Enhancement In Multiple-Noise Conditions using Deep Neural Networks
    Kumar, Anurag
    Florencio, Dinei
    17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 3738 - 3742
  • [32] Online noise estimation using stochastic-gain HMM for speech enhancement
    Zhao, David Y.
    Kleijn, W. Bastiaan
    Ypma, Alexander
    de Vries, Bert
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2008, 16 (04): : 835 - 846
  • [33] Variance Normalized Perceptual Subspace Speech Enhancement With Noise Estimation Using SPP
    Surendran, Sudeep
    Kumar, T. Kishore
    2016 INTERNATIONAL CONFERENCE ON NEXT GENERATION INTELLIGENT SYSTEMS (ICNGIS), 2016, : 364 - 369
  • [34] Speech Enhancement Based on Adaptive Noise Power Estimation Using Spectral Difference
    Choi, Jae-Hun
    Chang, Joon-Hyuk
    Kim, Dong Kook
    Kim, Suhyun
    IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 2011, E94A (10) : 2031 - 2034
  • [35] Speech Enhancement Using Successive State Estimation under Industrial Noise Environment
    Wu, Qinghe
    Wu, Haifeng
    Zeng, Yu
    PROCEEDINGS OF THE 2018 2ND INTERNATIONAL CONFERENCE ON ALGORITHMS, COMPUTING AND SYSTEMS (ICACS 2018), 2018, : 214 - 219
  • [36] Speech Enhancement, Gain, and Noise Spectrum Adaptation Using Approximate Bayesian Estimation
    Hao, Jiucang
    Attias, Hagai
    Nagarajan, Srikantan
    Lee, Te-Won
    Sejnowski, Terrence J.
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2009, 17 (01): : 24 - 37
  • [37] Noise Estimation Using Mean Square Cross Prediction Error for Speech Enhancement
    Wang, Gang
    Li, Chunguang
    Dong, Le
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS, 2010, 57 (07) : 1489 - 1499
  • [38] Single Channel Speech Enhancement: using Wiener Filtering with Recursive Noise Estimation
    Upadhyay, Navneet
    Jaiswal, Rahul Kumar
    PROCEEDING OF THE SEVENTH INTERNATIONAL CONFERENCE ON INTELLIGENT HUMAN COMPUTER INTERACTION (IHCI 2015), 2016, 84 : 22 - 30
  • [39] ADAPTIVE NOISE POWER ESTIMATION USING SPECTRAL DIFFERENCE FOR ROBUST SPEECH ENHANCEMENT
    Choi, Jae-Hun
    Kim, Sang-Kyun
    Chang, Joon-Hyuk
    2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2012, : 4649 - 4652
  • [40] A New Spectral Subtraction Method for Speech Enhancement using Adaptive Noise Estimation
    Bharti, Shambhu Shankar
    Gupta, Manish
    Agarwal, Suneeta
    2016 3RD INTERNATIONAL CONFERENCE ON RECENT ADVANCES IN INFORMATION TECHNOLOGY (RAIT), 2016, : 128 - 132