GLOTTAL MODELS FOR DIGITAL SPEECH PROCESSING - A HISTORICAL SURVEY AND NEW RESULTS

Cited by: 9
Authors
CUMMINGS, KE
CLEMENTS, MA
Affiliation
[1] School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA
DOI
10.1006/dspr.1995.1003
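Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline classification codes
0808; 0809
Abstract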
Glottal modeling has been an important topic of research in digital speech processing for many years. The ability to accurately model the glottal excitation is important for applications as varied as acoustic and articulatory speech synthesis, speech coding, and speech analysis. Many glottal models that differ in form and complexity have been suggested over the years. Possible models range from simple parametric models of the glottal volume velocity or the glottal flow derivative that assume linear separability of the glottal source and the vocal tract, through more complex parametric function and mechanical models that allow for interaction between the glottal source and the vocal tract, to very complex models that are based directly on the physiological properties of the glottis. This paper will provide a historical survey of glottal models, discussing their form and complexity along with the applications for which each is appropriate. This paper will also present a discussion of the problem of modeling the glottal excitation of different styles of speech, a topic that is important for applications such as natural, high-quality speech synthesis. A glottal model that is capable of modeling eleven commonly encountered styles of speech will be presented. © 1995 Academic Press, Inc.
Pages: 21-42
Page count: 22