APPLICATIONS OF ADAPTIVE WAVELETS FOR SPEECH

Cited by: 22
Authors
KADAMBE, S [1]
SRINIVASAN, P [1]
Affiliation
[1] PURDUE UNIV, SCH ELECT ENGN, W LAFAYETTE, IN 47907
Keywords
ADAPTIVE WAVELET TRANSFORMS; PHONEME; VOICED SOUNDS; UNVOICED SOUNDS; CLASSIFICATION; NEURAL NETWORKS; SIGNAL APPROXIMATION; SPEAKER IDENTIFICATION; SUPER WAVELETS
DOI
10.1117/12.172410
Chinese Library Classification (CLC) number
O43 [Optics]
Discipline classification code
070207; 0803
Abstract
Our objective is to demonstrate the applicability of adaptive wavelets to speech applications. In particular, we discuss two applications: classification of unvoiced sounds and speaker identification. First, we describe a method to classify unvoiced sounds using adaptive wavelets, which would help in developing a unified algorithm to classify phonemes (speech sounds). Next, we show that adaptive wavelets can identify speakers from very short speech data (one pitch period). The described text-independent, phoneme-based speaker identification algorithm identifies a speaker by first modeling phonemes and then clustering all the phonemes belonging to the same speaker into one class. For both applications, we use a feed-forward neural network architecture. We demonstrate the performance of both the unvoiced sound classifier and the speaker identification algorithm using representative real speech examples.
Pages: 2204-2211
Page count: 8