The Information Structure-prosody interface in text-to-speech technologies. An empirical perspective

Cited by: 1
Authors
Dominguez, Monica [1]
Farrus, Mireia [2]
Wanner, Leo [1, 3]
Affiliations
[1] Univ Pompeu Fabra, Barcelona, Spain
[2] Univ Barcelona, Barcelona, Spain
[3] Catalan Inst Res & Adv Studies ICREA, Barcelona, Spain
Funding
EU Horizon 2020
Keywords
communicative structure; information structure; intonation; prosody; rheme; specifier; thematicity; theme; ToBI
DOI
10.1515/cllt-2020-0008
Chinese Library Classification number
H0 [Linguistics]
Discipline classification codes
030303; 0501; 050102
Abstract
The correspondence between the communicative intention of a speaker in terms of Information Structure and the way this speaker reflects communicative aspects by means of prosody has been a fruitful field of study in Linguistics. However, text-to-speech applications still lack the variability and richness with which humans display their communication skills in speech. Some attempts were made in the past to model one aspect of Information Structure, namely thematicity, for application to intonation generation in text-to-speech technologies. Yet these applications suffer from two limitations: (i) they draw upon a small number of made-up, simple question-answer pairs rather than on real (spoken or written) corpus material; and (ii) they do not explore whether any other interpretation would better suit a wider range of textual genres beyond dialogs. In this paper, two different interpretations of thematicity in the field of speech technologies are examined: the state-of-the-art binary (and flat) theme-rheme, and the hierarchical thematicity defined by Igor Mel'čuk within the Meaning-Text Theory. The outcome of the experiments on a corpus of native speakers of US English suggests that the latter interpretation of thematicity has a versatile implementation potential for text-to-speech applications of the Information Structure-prosody interface.
Pages: 419-445
Page count: 27