Leveraging the temporal dynamics of anticipatory vowel-to-vowel coarticulation in linguistic prediction: A statistical modeling approach

Cited by: 2

Authors:

Flego, Stefon [1]

Forrest, Jon [2]

Affiliations:

[1] Indiana Univ, Dept Linguist, Bloomington, IN 47405 USA

[2] Univ Georgia, Dept Linguist, Athens, GA 30602 USA

Source:

JOURNAL OF PHONETICS | 2021 / Vol. 88

Keywords:

Anticipatory coarticulation; Linguistic prediction; Spectral change; Formant dynamics; Curve fitting; Bayesian modeling; TIME-COURSE; SPEECH-PERCEPTION; AMERICAN ENGLISH; VERTICAL-BAR; ACOUSTIC ANALYSIS; SPOKEN-LANGUAGE; TRACKING; SPEAKERS; IDENTIFICATION; ARTICULATION;

DOI:

10.1016/j.wocn.2021.101093

Chinese Library Classification (CLC) number:

H0 [Linguistics];

Discipline classification codes:

030303 ; 0501 ; 050102 ;

Abstract:

Previous research has shown that coarticulatory information in the signal orients listeners in spoken word recognition, and that articulatory and perceptual dynamics closely parallel one another. The current study uses statistical classification to test the power of time-varying anticipatory coarticulatory information present in the acoustic signal for predicting upcoming sounds in the speech stream. Bayesian mixed-effects multinomial logistic regression models were trained on several different representations of spectral variation present in V1 in order to predict the identity of V2 in naturally coarticulated transconsonantal V1...V2 sequences. Models trained on simple measures of spectral variation (e.g. formant measures taken at V1 midpoint) were compared with models trained on more sophisticated time-varying representations (e.g. the estimated coefficients of polynomial curves fit to whole formant trajectories of V1). Accuracy in predicting V2 was greater when models were trained on dynamic representations of spectral variation in V1, and those trained on quadratic and cubic polynomial representations achieved the greatest accuracy, with a gain of more than 15 percentage points in correct classification over using midpoint formant frequencies alone. The results demonstrate that spectral representations with high temporal resolution capture more disambiguating anticipatory information available in the signal than representations with lower temporal resolution. © 2021 Elsevier Ltd. All rights reserved.

Pages: 19

References: 88 in total

