Turbo Processing for Speech Recognition

Cited by: 2
Authors
Moon, Todd K. [1, 2]
Gunther, Jacob H. [1, 2]
Broadus, Cortnie [3]
Hou, Wendy [4]
Nelson, Nils [3]
Affiliations
[1] Utah State Univ, Informat Dynam Lab, Logan, UT 84322 USA
[2] Utah State Univ, Dept Elect & Comp Engn, Logan, UT 84322 USA
[3] Utah State Univ, Dept Math, Logan, UT 84322 USA
[4] Yale Univ, Dept Math, New Haven, CT 06511 USA
Keywords
Human-machine interface; speech processing; turbo processing; hidden Markov models; maximization
DOI
10.1109/TCYB.2013.2247593
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Speech recognition is a classic example of a human/machine interface, typifying many of the difficulties and opportunities of human/machine interaction. In this paper, speech recognition is used as an example of applying turbo processing principles to the general problem of human/machine interfaces. Speech recognizers frequently involve a model representing phonemic information at a local level, followed by a language model representing information at a nonlocal level. This structure is analogous to the local (e.g., equalizer) and nonlocal (e.g., error correction decoding) elements common in digital communications. Drawing on the analogy of turbo processing for digital communications, turbo speech processing iteratively feeds back the output of the language model to be used as prior probabilities for the phonemic model. This analogy is developed here, and the performance of the turbo model is characterized using an artificial language model. With turbo processing, the relative error rate improves significantly, especially in high-noise settings.
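The feedback loop described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes a toy setup in which the local model is a Markov chain over symbols decoded by forward-backward, and the nonlocal "language model" is a short list of allowed words. All names (`local_model`, `language_model`, `turbo_decode`) and all numbers are hypothetical, chosen only to show how extrinsic information from one stage can serve as the prior for the other.

```python
import numpy as np

EPS = 1e-12  # floor that keeps messages strictly positive before division


def normalize(p):
    """Scale the last axis to sum to 1 after flooring at EPS."""
    p = np.maximum(p, EPS)
    return p / p.sum(axis=-1, keepdims=True)


def local_model(lik, trans, init, prior):
    """Forward-backward on a symbol chain; returns extrinsic per-position messages.

    prior[t, k] is the feedback from the language model, used as extra evidence.
    """
    T, K = lik.shape
    ev = lik * prior
    alpha = np.zeros((T, K))
    beta = np.ones((T, K))
    alpha[0] = normalize(init * ev[0])
    for t in range(1, T):
        alpha[t] = normalize((alpha[t - 1] @ trans) * ev[t])
    for t in range(T - 2, -1, -1):
        beta[t] = normalize(trans @ (ev[t + 1] * beta[t + 1]))
    post = normalize(alpha * beta)
    # Divide out the incoming prior so only new information is passed on.
    return normalize(post / np.maximum(prior, EPS))


def language_model(words, word_prior, msg):
    """Score each allowed word, then return word posteriors and extrinsic marginals."""
    W, T = words.shape
    score = word_prior * np.prod(msg[np.arange(T), words], axis=1)
    score = score / score.sum()
    marg = np.zeros_like(msg)
    for w in range(W):
        marg[np.arange(T), words[w]] += score[w]
    return score, normalize(marg / msg)


def turbo_decode(lik, trans, init, words, word_prior, n_iter=5):
    """Alternate the two stages, feeding language-model output back as priors."""
    T, K = lik.shape
    prior = np.full((T, K), 1.0 / K)  # no feedback on the first pass
    for _ in range(n_iter):
        ext_local = local_model(lik, trans, init, prior)
        score, prior = language_model(words, word_prior, ext_local)
    return words[np.argmax(score)], score


# Toy data (all values are illustrative assumptions).
words = np.array([[0, 1, 2, 0], [1, 1, 0, 2], [2, 0, 1, 1]])
word_prior = np.full(3, 1.0 / 3.0)
lik = np.full((4, 3), 0.2)
lik[np.arange(4), words[0]] = 0.6   # observations mostly support word 0
lik[2] = [0.5, 0.2, 0.3]            # frame 2 alone favors the wrong symbol
trans = normalize(np.ones((3, 3)) + 0.2 * np.eye(3))
init = np.full(3, 1.0 / 3.0)

decoded, score = turbo_decode(lik, trans, init, words, word_prior)
print(decoded, score)
```

In this toy data, a hard per-frame decision on `lik` would pick symbol 0 at frame 2, which matches no word in the list. Passing soft messages between the two stages lets the word-level constraint correct that frame. Dividing each stage's posterior by its incoming message is the usual way turbo schemes avoid feeding a stage its own information back.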
Pages: 83-91
Number of pages: 9