Analog auditory perception model for robust speaker recognition

Cited by: 0
Authors
Deng, Yunbin [1]
Xu, Roger [1]
Affiliation
[1] Intelligent Automat Inc, Rockville, MD 20855 USA
Source
PROCEEDINGS OF THE EIGHTH IASTED INTERNATIONAL CONFERENCE ON SIGNAL AND IMAGE PROCESSING | 2006
Keywords
auditory model; robust speech feature extraction; speaker recognition; analog VLSI; speaker recognition hardware;
DOI
Not available
Chinese Library Classification
TP31 [Computer software];
Subject classification code
081202; 0835;
Abstract
An auditory perception model for noise-robust speech feature extraction is presented that abstracts the effective signal processing in the human ear. The auditory effects taken into account include insensitivity to low-frequency signals, Mel-scale and multi-scale frequency resolution, static nonlinear compression, and adaptive compression. Unlike the widely used discrete-time digital signal processing methods, the model assumes continuous-time filtering and rectification, making it amenable to real-time, low-power analog VLSI implementation. A custom chip in 0.5 µm CMOS technology implements the general form of the model with digitally programmable filter parameters and consumes 9 mW of power. Experiments on the YOHO speaker identification database demonstrate consistent robustness of the new features to noise of various statistics, yielding significant improvements in text-independent speaker recognition accuracy over models identically trained using Mel-scale Frequency Cepstral Coefficient (MFCC) features.
Pages: 443 / +
Page count: 3
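
The processing stages named in the abstract (Mel-scale band spacing, rectification, and static nonlinear compression) can be illustrated with a minimal discrete-time sketch. This is not the authors' analog circuit or their exact model: the abstract describes continuous-time filtering, whereas the code below approximates the stages on sampled data. All function names, the 300 Hz to 4000 Hz band edges, the smoothing coefficient, and the choice of a logarithm as the compressive nonlinearity are assumptions made for illustration only. The Hz-to-Mel mapping used is the common 2595 log10(1 + f/700) convention.

```python
import math


def hz_to_mel(f_hz):
    """Map frequency in Hz to the Mel scale (common 2595/700 convention)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_center_frequencies(f_low, f_high, n_bands):
    """Center frequencies (Hz) spaced uniformly on the Mel scale.

    Choosing f_low well above 0 Hz is one (assumed) way to mimic the
    ear's insensitivity to low-frequency signals.
    """
    m_low, m_high = hz_to_mel(f_low), hz_to_mel(f_high)
    step = (m_high - m_low) / (n_bands + 1)
    return [mel_to_hz(m_low + step * (k + 1)) for k in range(n_bands)]


def half_wave_rectify(samples):
    """Keep positive excursions only; negative samples become 0."""
    return [max(0.0, x) for x in samples]


def smooth(samples, alpha):
    """One-pole low-pass: y[n] = y[n-1] + alpha * (x[n] - y[n-1])."""
    out, y = [], 0.0
    for x in samples:
        y += alpha * (x - y)
        out.append(y)
    return out


def static_compress(envelope, eps=1e-6):
    """Static nonlinear compression, here a log (assumed nonlinearity)."""
    return [math.log(eps + e) for e in envelope]


# Hypothetical usage on one band's already band-pass-filtered signal.
centers = mel_center_frequencies(300.0, 4000.0, 8)
band_signal = [math.sin(2.0 * math.pi * 0.05 * n) for n in range(200)]
feature = static_compress(smooth(half_wave_rectify(band_signal), 0.1))
```

In this sketch each band would produce one compressed envelope; the adaptive compression mentioned in the abstract is not modeled here.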