Speaking in shorthand - A syllable-centric perspective for understanding pronunciation variation

Cited by: 180
Author
Greenberg, S. [1]
Affiliation
[1] Int Comp Sci Inst, Berkeley, CA 94704 USA
Funding
National Science Foundation (USA)
Keywords
automatic speech recognition; pronunciation variation; spoken language; syllables
DOI
10.1016/S0167-6393(99)00050-3
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Current-generation automatic speech recognition (ASR) systems model spoken discourse as a quasi-linear sequence of words and phones. Because it is unusual for every phone within a word to be pronounced in a standard ("canonical") way, ASR systems often depend on a multi-pronunciation lexicon to match an acoustic sequence with a lexical unit. Since there are, in practice, many different ways for a word to be pronounced, this standard approach adds a layer of complexity and ambiguity to the decoding process which, if simplified, could potentially improve recognition performance. Systematic analysis of pronunciation variation in a corpus of spontaneous English discourse (Switchboard) demonstrates that the variation observed is more systematic at the level of the syllable than at the phonetic-segment level. Thus, syllabic onsets are realized in canonical form far more frequently than either coda or nuclear constituents. Prosodic prominence and lexical stress also appear to play an important role in pronunciation variation. The governing mechanism is likely to involve the informational valence associated with syllabic and lexical elements, and for this reason pronunciation variation offers a potential window onto the mechanisms responsible for the production and understanding of spoken language. © 1999 Elsevier Science B.V. All rights reserved.
Pages: 159-176
Page count: 18