Vowel Normalisation in Latent Space for Sociolinguistics

Cited by: 0
Authors
Burridge, James [1]
Affiliations
[1] Univ Portsmouth, Sch Math & Phys, Portsmouth, Hants, England
Source
INTERSPEECH 2023 | 2023
Keywords
normalisation; formants; vowels; dialects; sociolinguistics; ACOUSTIC CHARACTERISTICS; FORMANTS; ENGLISH;
DOI
10.21437/Interspeech.2023-1704
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
To study variations in vowel sounds between different socio-linguistic groups, sounds must be normalized to minimize variations caused by physical factors. The Lobanov method, for example, standardizes formant distributions by speaker. Since formants are often difficult to measure, and offer only a partial description of sounds, a robust and reproducible normalisation method based on the whole spectrum would be useful. One candidate is speaker-level standardization in the latent space of a variational auto-encoder, trained on a large sample of vowel spectra. We show that whole spectrum transformations induced by latent normalisation shift formants similarly to direct formant normalisation. We also show that formant-based normalisation procedures can be used to induce whole-spectrum transformations via latent space.
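As an illustration of the speaker-level standardisation the abstract attributes to the Lobanov method, the following is a minimal sketch, not the paper's implementation. The function name `lobanov_normalise`, the `(speaker, F1, F2)` token layout, and the use of the population standard deviation are assumptions made for this example. Each formant is z-scored within each speaker, so that speakers with different vocal-tract sizes are mapped onto a common scale.

```python
from collections import defaultdict
from statistics import mean, pstdev


def lobanov_normalise(tokens):
    """Z-score F1 and F2 within each speaker.

    tokens: list of (speaker, f1_hz, f2_hz) tuples (hypothetical layout).
    Returns a list of (speaker, z1, z2) in the same order as the input.
    """
    # Group the formant measurements by speaker.
    by_speaker = defaultdict(list)
    for speaker, f1, f2 in tokens:
        by_speaker[speaker].append((f1, f2))

    # Per-speaker mean and standard deviation for each formant.
    # Assumption: population standard deviation (pstdev) is used here.
    stats = {}
    for speaker, values in by_speaker.items():
        f1s = [v[0] for v in values]
        f2s = [v[1] for v in values]
        s1, s2 = pstdev(f1s), pstdev(f2s)
        if s1 == 0 or s2 == 0:
            raise ValueError(f"speaker {speaker!r} has no formant variation")
        stats[speaker] = (mean(f1s), s1, mean(f2s), s2)

    # Standardise every token using its own speaker's statistics.
    normalised = []
    for speaker, f1, f2 in tokens:
        m1, s1, m2, s2 = stats[speaker]
        normalised.append((speaker, (f1 - m1) / s1, (f2 - m2) / s2))
    return normalised


# Made-up formant values in Hz for two speakers, for illustration only.
example = [
    ("A", 300, 2200),
    ("A", 700, 1200),
    ("B", 400, 2600),
    ("B", 900, 1400),
]
result = lobanov_normalise(example)
```

The same per-speaker centring and scaling could in principle be applied to the coordinates of a latent vector instead of to formants, which is the kind of latent-space standardisation the abstract proposes; the encoder and decoder needed for that step are outside the scope of this sketch.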
Pages: 3547-3551
Page count: 5