Modelling asynchrony in automatic speech recognition using loosely coupled hidden Markov models

Cited by: 0
Authors
Nock, HJ [1]
Young, SJ [1]
Affiliations
[1] Univ Cambridge, Dept Engn, Cambridge CB2 1PZ, England
Keywords
automatic speech recognition; pronunciation modelling; loosely coupled hidden Markov models; variational approximation
DOI
Not available
Chinese Library Classification (CLC) number
B84 [Psychology]
Discipline classification code
04; 0402
Abstract
Hidden Markov models (HMMs) have been successful for modelling the dynamics of carefully dictated speech, but their performance degrades severely when used to model conversational speech. Since speech is produced by a system of loosely coupled articulators, stochastic models explicitly representing this parallelism may have advantages for automatic speech recognition (ASR), particularly when trying to model the phonological effects inherent in casual spontaneous speech. This paper presents a preliminary feasibility study of one such model class: loosely coupled HMMs. Exact model estimation and decoding are potentially expensive, so possible approximate algorithms are also discussed. Comparison of one particular loosely coupled model on an isolated word task suggests that loosely coupled HMMs merit further investigation. An approximate algorithm whose performance is almost always statistically indistinguishable from that of the exact algorithm is also identified, making more extensive research computationally feasible. (C) 2002 Cognitive Science Society, Inc. All rights reserved.
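To illustrate why exact inference can become expensive, the sketch below runs the forward algorithm for two coupled chains over their product state space. It is not the model or algorithm from the paper: the coupling form (each chain's next state depends on the previous states of both chains), the function name `coupled_forward`, and all parameter values are assumptions chosen for illustration. With K chains of N states each, each time step of this exact recursion costs on the order of N^(2K) operations, which is what motivates approximate alternatives.

```python
from itertools import product


def coupled_forward(init_a, init_b, trans_a, trans_b, emit_a, emit_b, obs_a, obs_b):
    """Exact likelihood of two coupled chains via the product state space.

    Assumed (hypothetical) parameterisation:
      trans_a[i][j][k] = P(a_t = k | a_{t-1} = i, b_{t-1} = j)
      trans_b[i][j][l] = P(b_t = l | a_{t-1} = i, b_{t-1} = j)
      emit_a[i][o]     = P(obs_a = o | a_t = i), and likewise emit_b.
    """
    n_a, n_b = len(init_a), len(init_b)
    states = list(product(range(n_a), range(n_b)))  # all (a, b) state pairs

    # Initialise alpha over the joint states with the first observations.
    alpha = {
        (i, j): init_a[i] * init_b[j] * emit_a[i][obs_a[0]] * emit_b[j][obs_b[0]]
        for i, j in states
    }

    # Each step sums over every previous joint state for every new joint
    # state, so the cost per frame is (n_a * n_b) ** 2.
    for oa, ob in zip(obs_a[1:], obs_b[1:]):
        new_alpha = {}
        for k, l in states:
            total = sum(
                alpha[i, j] * trans_a[i][j][k] * trans_b[i][j][l]
                for i, j in states
            )
            new_alpha[k, l] = total * emit_a[k][oa] * emit_b[l][ob]
        alpha = new_alpha

    return sum(alpha.values())


# Illustrative two-state chains with binary observations (made-up values).
init_a = [0.6, 0.4]
init_b = [0.5, 0.5]
trans_a = [[[0.9, 0.1], [0.7, 0.3]], [[0.2, 0.8], [0.4, 0.6]]]
trans_b = [[[0.8, 0.2], [0.3, 0.7]], [[0.6, 0.4], [0.1, 0.9]]]
emit_a = [[0.9, 0.1], [0.2, 0.8]]
emit_b = [[0.7, 0.3], [0.4, 0.6]]

likelihood = coupled_forward(
    init_a, init_b, trans_a, trans_b, emit_a, emit_b, [0, 1, 1], [1, 1, 0]
)
print(likelihood)
```

An approximate scheme, such as the variational approximation listed in the keywords, would avoid enumerating the joint states; this sketch covers only the exact baseline.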
Pages: 283-301
Number of pages: 19