Modelling asynchrony in automatic speech recognition using loosely coupled hidden Markov models

Cited by: 0
Authors
Nock, HJ [1]
Young, SJ [1]
Affiliations
[1] Univ Cambridge, Dept Engn, Cambridge CB2 1PZ, England
Keywords
automatic speech recognition; pronunciation modelling; loosely coupled hidden Markov models; variational approximation
DOI
Not available
Chinese Library Classification
B84 [Psychology]
Discipline classification code
04; 0402
Abstract
Hidden Markov models (HMMs) have been successful for modelling the dynamics of carefully dictated speech, but their performance degrades severely when used to model conversational speech. Since speech is produced by a system of loosely coupled articulators, stochastic models explicitly representing this parallelism may have advantages for automatic speech recognition (ASR), particularly when trying to model the phonological effects inherent in casual spontaneous speech. This paper presents a preliminary feasibility study of one such model class: loosely coupled HMMs. Exact model estimation and decoding are potentially expensive, so possible approximate algorithms are also discussed. Evaluation of one particular loosely coupled model on an isolated word task suggests that loosely coupled HMMs merit further investigation. An approximate algorithm whose performance is almost always statistically indistinguishable from that of the exact algorithm is also identified, making more extensive research computationally feasible. (C) 2002 Cognitive Science Society, Inc. All rights reserved.
Pages: 283-301
Number of pages: 19