Aging speech recognition with speaker adaptation techniques: Study on medium vocabulary continuous Bengali speech

Cited by: 4

Authors:

Das, Biswajit [1]

Mandal, Sandipan [1]

Mitra, Pabitra [1]

Basu, Anupam [1]

Affiliation:

[1] Indian Inst Technol, Dept Comp Sci & Engn, Kharagpur 721302, W Bengal, India

Source:

PATTERN RECOGNITION LETTERS | 2013 / Vol. 34 / Issue 03

Keywords:

Aging speech recognition; Vocal tract length normalization (VTLN); Maximum likelihood linear transform (MLLT); Maximum likelihood linear regression (MLLR); Maximum a posteriori (MAP); Maximum mutual information estimation (MMIE); VOCAL-TRACT; EXPECTATION MAXIMIZATION; NORMALIZATION; AGE;

DOI:

10.1016/j.patrec.2012.10.029

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This article describes the development of a Bengali speech recognition system for the aging population using various adaptation techniques. Variability in acoustic characteristics among speakers degrades speech recognition accuracy. In general, perceptual as well as acoustic variations exist among speakers, but they are more pronounced between young and aged populations. Deviations in voice source features between the two age groups affect speech recognition performance. Existing automatic speech recognition algorithms demand a large amount of training data covering all variability to develop a robust speech recognition system. However, speaker normalization and adaptation techniques attempt to reduce inter-speaker or intra-speaker acoustic variability without requiring a large amount of training data. Here, conventional acoustic model adaptation methods, e.g. vocal tract length normalization, maximum likelihood linear regression and/or maximum a posteriori, are combined to improve recognition accuracy. Moreover, the maximum mutual information estimation technique has been implemented in this study. (C) 2012 Elsevier B.V. All rights reserved.


Pages: 335-343

Number of pages: 9

