Conversational speech synthesis and the need for some laughter

Cited by: 22

Authors:

Campbell, Nick [1]

Affiliations:

[1] Natl Inst Informat & Commun Technol, Keihanna Sci City, Kyoto 6190288, Japan

[2] ATR Spoken Language Commun Lab, Speech & Acoust Proc Dept, Keihanna Sci City, Kyoto 6190288, Japan

Source:

IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | 2006 / Vol. 14 / No. 4

Keywords:

affect; conversation; emotion; expression; laughter; nonverbal; social interaction; speech synthesis;

DOI:

10.1109/TASL.2006.876131

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

This paper reports progress in the synthesis of conversational speech, from the viewpoint of work carried out on the analysis of a very large corpus of expressive speech in normal everyday situations. With recent developments in concatenative techniques, speech synthesis has overcome the barrier of realistically portraying extra-linguistic information by using the actual voice of a recognizable person as a source for units, combined with minimal use of signal processing. However, the technology still faces the problem of expressing paralinguistic information, i.e., the variety in the types of speech and laughter that a person might use in everyday social interactions. Paralinguistic modification of an utterance portrays the speaker's affective states and shows his or her relationships with the speaker through variations in the manner of speaking, by means of prosody and voice quality. These inflections are carried on the propositional content of an utterance, and can perhaps be modeled by rule, but they are also expressed through nonverbal utterances, the complexity of which may be beyond the capabilities of many current synthesis methods. We suggest that this problem may be solved by the use of phrase-sized utterance units taken intact from a large corpus.

Pages: 1171-1178

Number of pages: 8
