Binaural speech enhancement system combining dereverberation and spatial masking-based noise removal for robust speech recognition

Times cited: 0
Authors
Tien Dung Tran [1]
Dang Khoa Nguyen [1]
Quoc Cuong Nguyen [2]
Huu Binh Nguyen [2]
Affiliations
[1] Hanoi Univ Sci & Technol, MICA Inst, Hanoi, Vietnam
[2] Hanoi Univ Sci & Technol, Sch Elect Engn, Hanoi, Vietnam
Source
2012 FOURTH INTERNATIONAL CONFERENCE ON COMMUNICATIONS AND ELECTRONICS (ICCE) | 2012
Keywords
Binaural speech enhancement; dereverberation; spatial mask; automatic speech recognition; SEPARATION
DOI
Not available
Chinese Library Classification
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Subject classification codes
0808; 0809
Abstract
In this paper, we present a binaural speech enhancement system as a preprocessing step for robust speech recognition. The system applies an existing dereverberation technique followed by a spatial masking-based noise removal algorithm, in which a threshold angle is used to retain only signals arriving from the desired directions. While state-of-the-art approaches fix the threshold angle heuristically over all time frames, we propose an adaptive computation in which the threshold angle is first learned from several noise-only frames and then updated frame by frame. Speech recognition results in a real environment show the effectiveness of the proposed speech enhancement approach.
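
The abstract does not give the paper's actual learning or update rule, so the following is only a minimal sketch of the general idea of a spatial mask with an adaptive threshold angle. Everything specific here is an assumption made for illustration: the target is taken to sit at 0 degrees azimuth, per-bin direction estimates are assumed to be already available, and the function names, the `margin` factor, and the smoothing constant `alpha` are hypothetical.

```python
def spatial_mask(angles_deg, threshold_deg):
    """Binary mask: 1.0 for bins whose estimated azimuth lies within
    +/- threshold_deg of the assumed target direction (0 degrees)."""
    return [1.0 if abs(a) <= threshold_deg else 0.0 for a in angles_deg]


def learn_threshold(noise_frames, margin=0.5):
    """Hypothetical initial rule: place the threshold at a fraction of the
    smallest absolute azimuth seen in the noise-only frames, so that the
    observed noise directions fall outside the retained sector."""
    nearest_noise = min(abs(a) for frame in noise_frames for a in frame)
    return margin * nearest_noise


def update_threshold(threshold, frame_angles, margin=0.5, alpha=0.9):
    """Hypothetical frame-by-frame update: treat bins outside the current
    sector as noise and smooth the threshold toward the value they imply."""
    outside = [abs(a) for a in frame_angles if abs(a) > threshold]
    if not outside:
        return threshold  # no noise evidence in this frame, keep the value
    target = margin * min(outside)
    return alpha * threshold + (1.0 - alpha) * target


def enhance(frames_angles, frames_mags, noise_frame_count=2):
    """Learn the threshold on the leading noise-only frames, then mask the
    remaining frames while updating the threshold after each one."""
    threshold = learn_threshold(frames_angles[:noise_frame_count])
    enhanced = []
    pairs = zip(frames_angles[noise_frame_count:], frames_mags[noise_frame_count:])
    for angles, mags in pairs:
        mask = spatial_mask(angles, threshold)
        enhanced.append([m * g for m, g in zip(mags, mask)])
        threshold = update_threshold(threshold, angles)
    return enhanced
```

In this sketch a fixed-threshold baseline corresponds to calling `spatial_mask` with a constant `threshold_deg` for every frame, whereas the adaptive variant lets the retained sector follow the observed noise directions.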
Pages: 345-350
Number of pages: 6