Acoustic variability and automatic recognition of children's speech

Cited by: 83
Authors
Gerosa, Matteo [1]
Giuliani, Diego [1]
Brugnara, Fabio [1]
Affiliations
[1] Univ Trent, Ctr Ric Sci & Technol, Ist Ric Sci & Tecnol, I-38050 Trento, Italy
Keywords
children's speech analysis; automatic speech recognition for children; speaker normalization; speaker adaptive acoustic modeling
DOI
10.1016/j.specom.2007.01.002
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper presents several acoustic analyses carried out on read speech collected from Italian children aged from 7 to 13 years and North American children aged from 5 to 17 years. These analyses aimed at achieving a better understanding of spectral and temporal changes in speech produced by children of various ages in view of the development of automatic speech recognition applications. The results of these analyses confirm and complement the results reported in the literature, showing that characteristics of children's speech change with age and that spectral and temporal variability decrease as age increases. In fact, younger children show a substantially higher intra- and inter-speaker variability with respect to older children and adults. We investigated the use of several methods for speaker adaptive acoustic modeling to cope with inter-speaker spectral variability and to improve recognition performance for children. These methods proved to be effective in recognition of read speech with a vocabulary of about 11k words. (C) 2007 Elsevier B.V. All rights reserved.
Pages: 847-860
Number of pages: 14