PITCH AND VOICED/UNVOICED DETERMINATION WITH AN AUDITORY MODEL

Cited by: 62
Authors
VAN IMMERSEEL, LM
MARTENS, JP
Affiliation
[1] Electronics Laboratory, University of Ghent, St.-Pietersnieuwstraat 41, B-9000 Gent
DOI
10.1121/1.402840
Chinese Library Classification (CLC) number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
In this paper, an accurate pitch and voiced/unvoiced determination algorithm for speech analysis is described. The algorithm is called AMPEX (auditory model-based pitch extractor) and it performs a temporal analysis of the outputs emerging from a new auditory model. However, in spite of its use of an auditory model, AMPEX should not be regarded as a substitute for any psychophysical theory of human auditory pitch perception. What is mainly described is the design of a computationally efficient auditory model, the perceptually motivated determination of the model parameters, the conception of a reliable pitch extractor for speech analysis, and the elaboration of an experimental procedure for evaluating the performance of such a pitch extractor. In the course of the evaluation experiment several kinds of speech stimuli including clean speech, bandpass-filtered speech, and noisy speech were presented to three different pitch extractors. The experimental results clearly indicate that AMPEX outperforms the best algorithms available.
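To illustrate what a pitch and voiced/unvoiced decision on a single speech frame can look like, the following is a minimal sketch of a generic autocorrelation-based estimator. It is not AMPEX and uses no auditory model: the function name `estimate_pitch`, the search range `fmin`/`fmax`, and the `voicing_threshold` of 0.5 are all assumptions chosen for illustration, not values taken from the paper.

```python
import math
import random


def estimate_pitch(frame, sample_rate, fmin=60.0, fmax=400.0, voicing_threshold=0.5):
    """Return an F0 estimate in Hz, or None if the frame is judged unvoiced.

    Generic single-channel autocorrelation method (illustrative only, not AMPEX).
    """
    energy = sum(x * x for x in frame)
    if energy == 0.0:
        return None  # silent frame: treat as unvoiced

    # Candidate periods, in samples, corresponding to the fmax..fmin search range.
    min_lag = max(1, int(sample_rate / fmax))
    max_lag = min(int(sample_rate / fmin), len(frame) - 1)

    best_lag, best_value = None, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Autocorrelation at this lag, summed over the overlapping samples.
        value = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        if value > best_value:
            best_lag, best_value = lag, value

    # Voiced only if the strongest peak is large relative to the frame energy.
    if best_lag is None or best_value / energy < voicing_threshold:
        return None
    return sample_rate / best_lag


if __name__ == "__main__":
    sample_rate = 8000
    tone = [math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(320)]
    rng = random.Random(0)
    noise = [rng.uniform(-1.0, 1.0) for _ in range(320)]
    print("tone :", estimate_pitch(tone, sample_rate))
    print("noise:", estimate_pitch(noise, sample_rate))
```

A periodic frame produces a strong autocorrelation peak at its period and is labeled voiced, while a noise-like or silent frame has no dominant peak and is labeled unvoiced. A single-channel method like this is far simpler than the multi-channel temporal analysis the abstract describes.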
Pages: 3511-3526
Page count: 16