Three-dimensional linear articulatory modeling of tongue, lips and face, based on MRI and video images

Cited by: 110
Authors
Badin, P [1]
Bailly, G
Revéret, L
Baciu, M
Segebarth, C
Savariaux, C
Affiliations
[1] Univ Grenoble 3, CNRS, UMR 5009, INPG, Inst Commun Parlee, Grenoble, France
[2] Univ Mendes, CNRS, UMR 5105, Lab Psychol Expt, Grenoble, France
[3] Univ Grenoble 1, INSERM, U438, LRC CEA, Grenoble, France
DOI
10.1006/jpho.2002.0166
Chinese Library Classification number
H0 [Linguistics];
Discipline classification code
030303 ; 0501 ; 050102 ;
Abstract
In this study, previous articulatory midsagittal models of tongue and lips are extended to full three-dimensional models. The geometry of these vocal organs is measured on one subject uttering a corpus of sustained articulations in French. The 3D data are obtained from magnetic resonance imaging of the tongue, and from front and profile video images of the subject's face marked with small beads. The degrees of freedom of the articulators, i.e., the uncorrelated linear components needed to represent the 3D coordinates of these articulators, are extracted by linear component analysis from these data. In addition to a common jaw height parameter, the tongue is controlled by four parameters while the lips and face are also driven by four parameters. These parameters are for the most part extracted from the midsagittal contours, and are clearly interpretable in phonetic/biomechanical terms. This implies that most 3D features such as tongue groove or lateral channels can be controlled by articulatory parameters defined for the midsagittal model. Similarly, the 3D geometry of the lips is determined by parameters such as lip protrusion or aperture, that can be measured from a profile view of the face. © 2002 Elsevier Science Ltd. All rights reserved.
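As a rough illustration of what "uncorrelated linear components" of 3D coordinates means, the sketch below runs an ordinary principal component analysis on synthetic point clouds. It is not the authors' procedure: the abstract indicates that most parameters are derived from the midsagittal contours, whereas this example uses plain PCA via SVD. The data and the names `linear_components` and `reconstruct` are invented for illustration.

```python
import numpy as np


def linear_components(shapes, n_components):
    """Plain PCA on 3D shapes (an assumption; not the paper's exact method).

    shapes: array of shape (n_articulations, n_points, 3).
    Returns the mean shape vector, the component basis, the per-articulation
    scores, and the fraction of variance explained by each component.
    """
    n_obs = shapes.shape[0]
    X = shapes.reshape(n_obs, -1)           # one flattened row per articulation
    mean = X.mean(axis=0)
    centered = X - mean
    _, s, Vt = np.linalg.svd(centered, full_matrices=False)
    basis = Vt[:n_components]               # orthonormal component directions
    scores = centered @ basis.T             # uncorrelated control values
    explained = s[:n_components] ** 2 / np.sum(s ** 2)
    return mean, basis, scores, explained


def reconstruct(mean, basis, scores, n_points):
    """Linear model: shape = mean + sum_k score_k * component_k."""
    return (mean + scores @ basis).reshape(len(scores), n_points, 3)


# Synthetic stand-in for measured articulator geometry: 40 articulations of
# 50 points each, generated from 4 hidden linear controls (hypothetical data).
rng = np.random.default_rng(0)
n_obs, n_points, n_true = 40, 50, 4
true_basis = rng.normal(size=(n_true, n_points * 3))
controls = rng.normal(size=(n_obs, n_true))
shapes = (controls @ true_basis).reshape(n_obs, n_points, 3)

mean, basis, scores, explained = linear_components(shapes, n_components=4)
rebuilt = reconstruct(mean, basis, scores, n_points)
max_error = float(np.abs(rebuilt - shapes).max())
```

Because the synthetic shapes are generated from four linear controls, four components are enough to rebuild them up to rounding error. Real articulatory data would leave a residual, and the number of components retained would be a modeling choice.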
Pages: 533-553
Number of pages: 21
Related papers
43 in total
[1]  
ABRY C, 1994, ADV SPEECH APPL, P182
[2]  
Badin P., 1998, P 3 ESCA COCOSDA INT, P249
[3]  
BADIN P, 2000, P 5 SEM SPEECH PROD, P261
[4]  
BADIN P, 1998, P ESCA TUT RES WORKS, P167
[5]   ANALYSIS OF VOCAL-TRACT SHAPE AND DIMENSIONS USING MAGNETIC-RESONANCE-IMAGING - VOWELS [J].
BAER, T ;
GORE, JC ;
GRACCO, LC ;
NYE, PW .
JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1991, 90 (02) :799-828
[6]   DERIVING VOCAL-TRACT AREA FUNCTIONS FROM MIDSAGITTAL PROFILES AND FORMANT FREQUENCIES - A NEW MODEL FOR VOWELS AND FRICATIVE CONSONANTS BASED ON EXPERIMENTAL-DATA [J].
BEAUTEMPS, D ;
BADIN, P ;
LABOISSIERE, R .
SPEECH COMMUNICATION, 1995, 16 (01) :27-47
[7]   Linear degrees of freedom in speech production: Analysis of cineradio- and labio-film data and articulatory-acoustic modeling [J].
Beautemps, D ;
Badin, P ;
Bailly, G .
JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2001, 109 (05) :2165-2180
[8]   ANALYSIS, SYNTHESIS, AND PERCEPTION OF VISIBLE ARTICULATORY MOVEMENTS [J].
BROOKE, NM ;
SUMMERFIELD, Q .
JOURNAL OF PHONETICS, 1983, 11 (01) :63-76
[9]  
COHEN MM, 1996, SPEECHREADING HUMANS, P153
[10]  
DANG J, 2000, P 5 SEM SPEECH PROD, P233