Cochannel Speech Segregation with Sparse Coding

Cited by: 0

Authors:

Ingale, Pallavi P. [1]

Nalbalwar, S. L. [1]

Affiliations:

[1] Dr Babasaheb Ambedkar Technol Univ, Dept Elect & Telecommun Engn, Lonere, India

Source:

2016 INTERNATIONAL CONFERENCE ON ELECTRICAL, ELECTRONICS, AND OPTIMIZATION TECHNIQUES (ICEEOT) | 2016

Keywords:

Speech segregation; Sparse coding; Computational auditory scene analysis (CASA); ALGORITHM;

DOI:

Not available

Chinese Library Classification:

TM [Electrical engineering]; TN [Electronic technology, communication technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

Most computational auditory scene analysis (CASA) systems rely on pitch-based features. Cochannel speech segregation involves two speakers, and the pitch ranges of male and female speech overlap to a large extent, so multi-pitch tracking becomes a nontrivial task. For same-gender mixtures, pitch tracking becomes harder still. More reliable features are therefore needed. Here we propose a cochannel speech segregation system with sparsity-based features. Sparse coding is applied to the cochleagram of the signal to obtain sparse approximation coefficients using pre-trained speaker dictionaries. We treat the sparse approximation coefficients as features because they are selected from the speaker-specific dictionaries to represent the input signal. These coefficients are a good basis for estimating binary masks. The speech waveform is resynthesized from the masked cochleagram of the mixture. Experimental results show that the proposed method produces better objective intelligibility scores than the baseline system.
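The pipeline described in the abstract (sparse-code the mixture cochleagram over speaker dictionaries, then derive a binary mask from the coefficients) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the sparse solver or the mask rule, so orthogonal matching pursuit and a "larger reconstructed energy wins" rule are assumptions, and the names `omp`, `binary_mask`, `D1`, `D2`, and `n_nonzero` are hypothetical. Cochleagram extraction and waveform resynthesis are left out.

```python
import numpy as np

def omp(D, y, n_nonzero):
    """Greedy orthogonal matching pursuit for one frame, y ~ D @ coef.
    Assumes the columns (atoms) of D have comparable norms."""
    residual = y.astype(float).copy()
    support = []
    coef = np.zeros(D.shape[1])
    sol = np.zeros(0)
    for _ in range(n_nonzero):
        # Pick the atom most correlated with the current residual.
        k = int(np.argmax(np.abs(D.T @ residual)))
        if k in support:
            break
        support.append(k)
        # Re-fit all selected atoms jointly by least squares.
        sol, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ sol
    if support:
        coef[support] = sol
    return coef

def binary_mask(Y, D1, D2, n_nonzero=3):
    """Y: mixture cochleagram (channels x frames).
    D1, D2: pre-trained speaker dictionaries (channels x atoms).
    Returns a mask that is 1 where speaker 1 is estimated to dominate."""
    D = np.hstack([D1, D2])          # joint dictionary for both speakers
    n1 = D1.shape[1]
    A = np.column_stack(
        [omp(D, Y[:, t], n_nonzero) for t in range(Y.shape[1])]
    )
    S1 = np.abs(D1 @ A[:n1])         # part explained by speaker-1 atoms
    S2 = np.abs(D2 @ A[n1:])         # part explained by speaker-2 atoms
    return (S1 > S2).astype(float)   # assumed mask rule, not from the paper
```

Under these assumptions, multiplying the mask element-wise with the mixture cochleagram (`mask * Y`) gives the masked cochleagram for speaker 1, and `(1 - mask) * Y` gives the one for speaker 2.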

Pages: 4589 - 4592

Page count: 4

50 entries in total

[1] SPARSE CODING FOR SPEECH RECOGNITION
Sivaram, G. S. V. S.
Nemala, Sridhar Krishna
Elhilali, Mounya
Trac D. Tran
Hermansky, Hynek
2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 4346 - 4349
[2] Optimization of learned dictionary for sparse coding in speech processing
He, Yongjun
Sun, Guanglu
Han, Jiqing
NEUROCOMPUTING, 2016, 173 : 471 - 482
[3] Continuous speech recognition with sparse coding
Smit, W. J.
Barnard, E.
COMPUTER SPEECH AND LANGUAGE, 2009, 23 (02) : 200 - 219
[4] Dictionary evaluation and optimization for sparse coding based speech processing
He, Yongjun
Chen, Deyun
Sun, Guanglu
Han, Jiqing
INFORMATION SCIENCES, 2015, 310 : 77 - 96
[5] SPEECH ENHANCEMENT WITH SPARSE CODING IN LEARNED DICTIONARIES
Sigg, Christian D.
Dikk, Tomas
Buhmann, Joachim M.
2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 4758 - 4761
[6] Hierarchical sparse coding framework for speech emotion recognition
Torres-Boza, Diana
Oveneke, Meshia Cedric
Wang, Fengna
Jiang, Dongmei
Verhelst, Werner
Sahli, Hichem
SPEECH COMMUNICATION, 2018, 99 : 80 - 89
[7] Spectrum enhancement with sparse coding for robust speech recognition
He, Yongjun
Sun, Guanglu
Han, Jiqing
DIGITAL SIGNAL PROCESSING, 2015, 43 : 59 - 70
[8] Speech Coding based on Compressed Sensing and Sparse Representation
Li, Shangjing
Zhu, Qi
ADVANCES IN COMPUTERS, ELECTRONICS AND MECHATRONICS, 2014, 667 : 242 - 247
[9] Parallel and Hierarchical Decision Making for Sparse Coding in Speech Recognition
Wang, Dong
Vipperla, Ravichander
Evans, Nicholas
12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 2568 - 2571
[10] Speech enhancement with a GSC-like structure employing sparse coding
Yang, Li-chun
Qian, Yun-tao
JOURNAL OF ZHEJIANG UNIVERSITY-SCIENCE C-COMPUTERS & ELECTRONICS, 2014, 15 (12) : 1154 - 1163
