Speech enhancement via sparse coding with ideal binary mask

Cited by: 0

Authors:

Sun, Juan [1]

Tang, Yibin [1]

Jiang, Aimin [1]

Xu, Ning [1]

Zhou, Lin [2]

Affiliations:

[1] Hohai University, College of Internet of Things Engineering, Changzhou, People's Republic of China

[2] Southeast University, School of Information Science and Engineering, Nanjing, People's Republic of China

Source:

2014 12TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING (ICSP) | 2014

Keywords:

speech enhancement; sparse representation; ideal binary mask; noise

DOI:

Not available

Chinese Library Classification:

TM [Electrical engineering]; TN [Electronic technology, communication technology]

Discipline classification codes:

0808; 0809

Abstract:

An improved algorithm is presented for speech enhancement that combines sparse representation with the ideal binary mask (IBM) method. The basic idea of the traditional IBM is to identify voiced components as the target signal and to label unvoiced components as interference noise. However, the voiced and unvoiced components still cannot be cleanly separated into target signal and interference noise. To fully exploit the merits of sparse representation theory, we extract the voiced component from both of these parts to obtain the final enhanced speech. Experimental results demonstrate that the proposed method achieves higher PESQ scores than the traditional IBM and efficiently improves speech intelligibility.
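
For readers unfamiliar with the baseline, the following is a minimal sketch of a generic ideal binary mask, not the authors' method. It assumes the common textbook criterion in which a time-frequency unit is kept when its local SNR exceeds a threshold; the abstract itself describes the labeling in terms of voiced and unvoiced components and does not give a formula. The function names, the `lc_db` threshold parameter, and the toy power values are all hypothetical, and the sparse-coding step of the paper is not reproduced here.

```python
import math


def ideal_binary_mask(speech_power, noise_power, lc_db=0.0):
    """Return a 0/1 mask over time-frequency units.

    A unit is set to 1 when its local SNR in dB exceeds lc_db
    (assumed criterion; lc_db is a hypothetical parameter name).
    """
    mask = []
    for s_row, n_row in zip(speech_power, noise_power):
        row = []
        for s, n in zip(s_row, n_row):
            if s <= 0.0:
                keep = False          # no speech energy: treat as noise
            elif n <= 0.0:
                keep = True           # speech with no noise: keep
            else:
                keep = 10.0 * math.log10(s / n) > lc_db
            row.append(1 if keep else 0)
        mask.append(row)
    return mask


def apply_mask(mixture, mask):
    """Multiply each unit of the noisy mixture by its mask value."""
    return [[x * m for x, m in zip(x_row, m_row)]
            for x_row, m_row in zip(mixture, mask)]


# Toy 2x2 example with made-up power values (frames x frequency bins).
speech = [[4.0, 0.1], [0.5, 9.0]]
noise = [[1.0, 1.0], [1.0, 1.0]]
mixture = [[5.0, 1.1], [1.5, 10.0]]

mask = ideal_binary_mask(speech, noise)
enhanced = apply_mask(mixture, mask)
print(mask)      # [[1, 0], [0, 1]]
print(enhanced)  # [[5.0, 0.0], [0.0, 10.0]]
```

The mask is called "ideal" because computing it requires the clean speech and the noise separately, which are available only in controlled experiments.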

Pages: 537-540

Page count: 4
