Turbo Processing for Speech Recognition

Cited by: 2

Authors:

Moon, Todd K. [1, 2]

Gunther, Jacob H. [1, 2]

Broadus, Cortnie [3]

Hou, Wendy [4]

Nelson, Nils [3]

Affiliations:

[1] Utah State Univ, Informat Dynam Lab, Logan, UT 84322 USA

[2] Utah State Univ, Dept Elect & Comp Engn, Logan, UT 84322 USA

[3] Utah State Univ, Dept Math, Logan, UT 84322 USA

[4] Yale Univ, Dept Math, New Haven, CT 06511 USA

Source:

IEEE TRANSACTIONS ON CYBERNETICS | 2014 / Vol. 44 / No. 01

Keywords:

Human-machine interface; speech processing; turbo processing; HIDDEN MARKOV-MODELS; MAXIMIZATION;

DOI:

10.1109/TCYB.2013.2247593

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

Speech recognition is a classic example of a human/machine interface, typifying many of the difficulties and opportunities of human/machine interaction. In this paper, speech recognition is used as an example of applying turbo processing principles to the general problem of human/machine interface. Speech recognizers frequently involve a model representing phonemic information at a local level, followed by a language model representing information at a nonlocal level. This structure is analogous to the local (e.g., equalizer) and nonlocal (e.g., error correction decoding) elements common in digital communications. Drawing from the analogy of turbo processing for digital communications, turbo speech processing iteratively feeds back the output of the language model to be used as prior probabilities for the phonemic model. This analogy is developed here, and the performance of this turbo model is characterized by using an artificial language model. Using turbo processing, the relative error rate improves significantly, especially in high-noise settings.
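The feedback idea in the abstract can be illustrated with a toy sketch. It is not the authors' implementation. The two-symbol alphabet, the bigram chain standing in for the language model, and every function name are assumptions made for illustration. Because the local model here is memoryless, a single feedback pass already gives the final answer. A recognizer whose phonemic model has memory would presumably repeat the exchange over several iterations.

```python
# Toy sketch only: the alphabet, the bigram table, and all names below
# are illustrative assumptions, not taken from the paper.

def normalize(v):
    """Scale a list of non-negative weights so that it sums to 1."""
    s = sum(v)
    return [x / s for x in v]


def language_model_extrinsic(soft_in, trans, init):
    """Forward-backward pass over a bigram chain (the nonlocal model).

    soft_in[t][k] is the local evidence for symbol k at position t.
    For each position, the result is the belief about that symbol implied
    by the chain and by the evidence at all OTHER positions. The position's
    own evidence is left out, so the result can be fed back as a prior
    without counting that evidence twice.
    """
    T, K = len(soft_in), len(init)
    # fwd[t]: prediction for position t from positions before t
    fwd = [normalize(init)]
    for t in range(1, T):
        prev = [fwd[t - 1][j] * soft_in[t - 1][j] for j in range(K)]
        fwd.append(normalize(
            [sum(prev[j] * trans[j][k] for j in range(K)) for k in range(K)]))
    # bwd[t]: message from positions after t
    bwd = [[1.0] * K for _ in range(T)]
    for t in range(T - 2, -1, -1):
        nxt = [bwd[t + 1][k] * soft_in[t + 1][k] for k in range(K)]
        bwd[t] = normalize(
            [sum(trans[j][k] * nxt[k] for k in range(K)) for j in range(K)])
    return [normalize([fwd[t][k] * bwd[t][k] for k in range(K)])
            for t in range(T)]


def local_posterior(likelihood, prior):
    """Memoryless local model: posterior is proportional to likelihood * prior."""
    return [normalize([l * p for l, p in zip(lik_t, pri_t)])
            for lik_t, pri_t in zip(likelihood, prior)]


def turbo_pass(likelihood, trans, init):
    """Local decision without priors, then again with fed-back priors."""
    K = len(init)
    uniform = [[1.0 / K] * K for _ in likelihood]
    first = local_posterior(likelihood, uniform)
    prior = language_model_extrinsic(likelihood, trans, init)
    second = local_posterior(likelihood, prior)
    return first, second


# Position 1 is acoustically ambiguous; its neighbours favour symbol 0.
likelihood = [[0.9, 0.1], [0.5, 0.5], [0.9, 0.1]]
trans = [[0.9, 0.1], [0.1, 0.9]]   # "sticky" bigram chain (assumed)
init = [0.5, 0.5]
first, second = turbo_pass(likelihood, trans, init)
```

In this example the local model alone gives position 1 a probability of 0.5 for symbol 0. With the fed-back priors that probability rises above 0.9, because the chain favours repeating the symbol seen at the neighbouring positions.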


Pages: 83-91

Number of pages: 9

