Cochannel Speech Segregation with Sparse Coding

Cited by: 0
Authors
Ingale, Pallavi P. [1 ]
Nalbalwar, S. L. [1 ]
Affiliation
[1] Dr Babasaheb Ambedkar Technol Univ, Dept Elect & Telecommun Engn, Lonere, India
Source
2016 INTERNATIONAL CONFERENCE ON ELECTRICAL, ELECTRONICS, AND OPTIMIZATION TECHNIQUES (ICEEOT) | 2016
Keywords
Speech segregation; Sparse coding; Computational auditory scene analysis (CASA); Algorithm
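DOI
Not available
Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809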
Abstract
Most computational auditory scene analysis (CASA) systems rely on pitch-based features. Cochannel speech segregation involves two speakers, and the pitch ranges of male and female speech overlap to a large extent, so multi-pitch tracking becomes a nontrivial task. For same-gender mixtures, pitch tracking becomes harder still. This motivates the use of more reliable features. Here we propose a cochannel speech segregation system with sparsity-based features. Sparse coding is applied to the cochleagram of the signal to obtain sparse approximation coefficients using pre-trained, speaker-specific dictionaries. We treat the sparse approximation coefficients as features because they are selected from the speaker-specific dictionaries to represent the input signal, which makes them a good choice for estimating binary masks. The speech waveform is resynthesized from the masked cochleagram of the mixture. Experimental results show that the proposed method produces better objective intelligibility scores than the baseline system.
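The masking step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name a sparse solver or a mask rule, so the greedy orthogonal matching pursuit, the "larger reconstruction wins" mask rule, and the names `omp`, `binary_mask`, `D1`, `D2`, and `k` are all assumptions made for illustration. Dictionary atoms are assumed to be unit-norm columns.

```python
import numpy as np

def omp(D, y, k):
    """Greedy orthogonal matching pursuit (assumed solver, not from the paper).

    Approximates y with at most k columns (atoms) of D, assumed unit-norm.
    """
    residual = y.astype(float).copy()
    support = []
    sol = np.zeros(0)
    coef = np.zeros(D.shape[1])
    for _ in range(k):
        # Pick the atom most correlated with the current residual.
        idx = int(np.argmax(np.abs(D.T @ residual)))
        if idx in support:
            break
        support.append(idx)
        # Refit all selected atoms jointly by least squares.
        sol, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ sol
        if np.linalg.norm(residual) < 1e-10:
            break
    coef[support] = sol
    return coef

def binary_mask(mix, D1, D2, k=3):
    """Estimate a binary mask for speaker 1 from a mixture cochleagram.

    mix: (channels, frames) mixture cochleagram.
    D1, D2: (channels, atoms) pre-trained dictionaries for speakers 1 and 2.
    The mask rule (speaker 1 wins where its reconstruction is larger)
    is an assumption for this sketch.
    """
    D = np.hstack([D1, D2])
    n1 = D1.shape[1]
    mask = np.zeros(mix.shape, dtype=bool)
    for t in range(mix.shape[1]):
        c = omp(D, mix[:, t], k)
        part1 = np.abs(D1 @ c[:n1])   # contribution explained by speaker 1 atoms
        part2 = np.abs(D2 @ c[n1:])   # contribution explained by speaker 2 atoms
        mask[:, t] = part1 > part2
    return mask
```

Multiplying the mixture cochleagram by the mask (`mix * mask`) would give the masked cochleagram for speaker 1; resynthesis of the waveform from it is not shown here.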
Pages: 4589-4592
Page count: 4