Improving the Performance of Deep Learning Based Speech Enhancement System Using Fuzzy Restricted Boltzmann Machine

Cited by: 6
Authors
Samui, Suman [1]
Chakrabarti, Indrajit [1]
Ghosh, Soumya K. [1]
Affiliation
[1] Indian Inst Technol Kharagpur, Kharagpur 721302, W Bengal, India
Source
PATTERN RECOGNITION AND MACHINE INTELLIGENCE, PREMI 2017 | 2017 / Vol. 10597
Keywords
Speech enhancement; Deep learning; Deep neural network; Restricted Boltzmann machine (RBM); Fuzzy restricted Boltzmann machine (FRBM); Unsupervised pre-training; Speech quality; Speech intelligibility
DOI
10.1007/978-3-319-69900-4_68
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Supervised speech enhancement based on machine learning is a new paradigm for segregating clean speech from background noise. This work presents a supervised speech enhancement system based on a robust deep learning method in which the pre-training phase of a deep belief network (DBN) is carried out with fuzzy restricted Boltzmann machines (FRBMs) instead of regular RBMs. The FRBM model is observed to outperform the RBM model, particularly when the training data are noisy. Experimental results across various noise scenarios show that the proposed approach outperforms conventional DNN-based speech enhancement methods that use regular RBMs for unsupervised pre-training.
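The unsupervised pre-training step mentioned in the abstract can be illustrated with a small sketch. This is a minimal NumPy illustration and not the authors' implementation: it shows one contrastive-divergence (CD-1) update for a Bernoulli RBM, plus one simplified way to evaluate a "fuzzy" free energy. The function names (`free_energy`, `cd1_step`, `fuzzy_free_energy`), the learning rate, and the representation of fuzzy parameters as a pair of lower and upper crisp parameter sets whose free energies are averaged are all assumptions made for illustration; the record itself does not specify these details.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def free_energy(v, W, b, c):
    """Free energy of a Bernoulli RBM for each row of v.

    v: (n, d) visible batch, W: (d, h) weights,
    b: (d,) visible bias, c: (h,) hidden bias.
    """
    # softplus(x) = log(1 + exp(x)), computed stably via logaddexp
    return -v @ b - np.logaddexp(0.0, v @ W + c).sum(axis=1)

def cd1_step(v0, W, b, c, rng, lr=0.05):
    """One CD-1 update on mini-batch v0; parameters are updated in place."""
    h0 = sigmoid(v0 @ W + c)                          # positive phase
    h_sample = (rng.random(h0.shape) < h0).astype(v0.dtype)
    v1 = sigmoid(h_sample @ W.T + b)                  # reconstruction
    h1 = sigmoid(v1 @ W + c)                          # negative phase
    n = v0.shape[0]
    W += lr * (v0.T @ h0 - v1.T @ h1) / n
    b += lr * (v0 - v1).mean(axis=0)
    c += lr * (h0 - h1).mean(axis=0)
    return W, b, c

def fuzzy_free_energy(v, lower, upper):
    """ASSUMPTION: approximate a defuzzified free energy by averaging the
    crisp free energies at the lower and upper parameter bounds.

    lower, upper: tuples (W, b, c) holding the two bounds.
    """
    return 0.5 * (free_energy(v, *lower) + free_energy(v, *upper))
```

When the lower and upper bounds coincide, `fuzzy_free_energy` reduces to the ordinary RBM free energy, which is one way to see a regular RBM as a special case of this simplified fuzzy variant.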
Pages: 534-542
Page count: 9