Turbo Processing for Speech Recognition

Cited by: 2
Authors
Moon, Todd K. [1 ,2 ]
Gunther, Jacob H. [1 ,2 ]
Broadus, Cortnie [3 ]
Hou, Wendy [4 ]
Nelson, Nils [3 ]
Affiliations
[1] Utah State Univ, Informat Dynam Lab, Logan, UT 84322 USA
[2] Utah State Univ, Dept Elect & Comp Engn, Logan, UT 84322 USA
[3] Utah State Univ, Dept Math, Logan, UT 84322 USA
[4] Yale Univ, Dept Math, New Haven, CT 06511 USA
Keywords
Human-machine interface; speech processing; turbo processing; HIDDEN MARKOV-MODELS; MAXIMIZATION;
DOI
10.1109/TCYB.2013.2247593
CLC number (Chinese Library Classification)
TP [automation technology; computer technology];
Discipline code
0812;
Abstract
Speech recognition is a classic example of a human/machine interface, typifying many of the difficulties and opportunities of human/machine interaction. In this paper, speech recognition is used as an example of applying turbo processing principles to the general problem of the human/machine interface. Speech recognizers frequently involve a model representing phonemic information at a local level, followed by a language model representing information at a nonlocal level. This structure is analogous to the local (e.g., equalizer) and nonlocal (e.g., error-correction decoding) elements common in digital communications. Drawing on the analogy with turbo processing in digital communications, turbo speech processing iteratively feeds the output of the language model back to serve as prior probabilities for the phonemic model. This analogy is developed here, and the performance of the turbo model is characterized using an artificial language model. With turbo processing, the relative error rate improves significantly, especially in high-noise settings.
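The feedback loop sketched in the abstract (local phonemic evidence combined with priors fed back from a nonlocal language model) can be illustrated with a toy iteration. Everything below is an assumption for illustration, not the paper's actual models: a random per-frame "acoustic" likelihood table stands in for the phonemic model, and a random bigram matrix stands in for the language model.

```python
import numpy as np

# Toy turbo-style iteration: local acoustic evidence and a nonlocal bigram
# language model exchange probabilities over several passes.
# All sizes and values are illustrative, not taken from the paper.

rng = np.random.default_rng(0)
V = 3  # symbol alphabet size
T = 5  # sequence length (frames)

# "Acoustic model": per-frame likelihoods p(observation | symbol), rows sum to 1
acoustic = rng.dirichlet(np.ones(V), size=T)   # shape (T, V)

# "Language model": bigram transitions p(symbol_t | symbol_{t-1})
bigram = rng.dirichlet(np.ones(V), size=V)     # shape (V, V)

prior = np.full((T, V), 1.0 / V)  # start from uniform priors
for _ in range(5):                # turbo iterations
    # Local stage: combine acoustic evidence with the current priors
    post = acoustic * prior
    post /= post.sum(axis=1, keepdims=True)

    # Nonlocal stage: the language model propagates information across
    # frames, producing new priors that are fed back to the local stage
    new_prior = post.copy()
    new_prior[1:] = post[:-1] @ bigram  # predict each frame from its predecessor
    new_prior /= new_prior.sum(axis=1, keepdims=True)
    prior = new_prior

decoded = post.argmax(axis=1)  # per-frame symbol decisions
print(decoded)
```

In the paper's actual system the local stage is an HMM phonemic model and the nonlocal stage is a language model over words, analogous to the equalizer/decoder pair in a turbo-equalized communication receiver; this sketch only shows the iterative exchange of priors.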
Pages: 83-91
Page count: 9
Related papers
50 records
  • [41] Time frequency representation for speech recognition
    Amsalem, Avishay
    Shallom, Ilan D.
    2006 INTERNATIONAL CONFERENCE ON INFORMATION AND TECHNOLOGY: RESEARCH AND EDUCATION, 2006, : 99 - +
  • [42] Counterfactually Fair Automatic Speech Recognition
    Sari, Leda
    Hasegawa-Johnson, Mark
    Yoo, Chang D.
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2021, 29 : 3515 - 3525
  • [43] Automatic emotion recognition by the speech signal
    Schuller, B
    Lang, M
    Rigoll, G
    6TH WORLD MULTICONFERENCE ON SYSTEMICS, CYBERNETICS AND INFORMATICS, VOL IX, PROCEEDINGS: IMAGE, ACOUSTIC, SPEECH AND SIGNAL PROCESSING II, 2002, : 367 - 372
  • [44] Acoustic Model Adaptation for Speech Recognition
    Shinoda, Koichi
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2010, E93D (09): : 2348 - 2362
  • [45] A novel duration model for speech recognition
    Yuan, Lichi
    Wan, Changxuan
    PROCEEDINGS OF THE FOURTH IASTED INTERNATIONAL CONFERENCE ON CIRCUITS, SIGNALS, AND SYSTEMS, 2006, : 279 - +
  • [46] On Continuous Speech Recognition of Indian English
    Jin, Xin
    Zhang, Keliang
    Huang, Xian
    Miao, Min
    2018 INTERNATIONAL CONFERENCE ON ALGORITHMS, COMPUTING AND ARTIFICIAL INTELLIGENCE (ACAI 2018), 2018,
  • [47] SYNTACTIC AND SEMANTIC POSTPROCESSING FOR SPEECH RECOGNITION
    KRALLMANN, H
    MARZI, R
    DECISION SUPPORT SYSTEMS, 1991, 7 (03) : 253 - 261
  • [48] SPEECH RECOGNITION IN THE NOISY CAR ENVIRONMENT
    RUEHL, HW
    DOBLER, S
    WEITH, J
    MEYER, P
    NOLL, A
    HAMER, HH
    PIOTROWSKI, H
    SPEECH COMMUNICATION, 1991, 10 (01) : 11 - 22
  • [49] Diacritics Effect on Arabic Speech Recognition
    Abed, Sa'ed
    Alshayeji, Mohammad
    Sultan, Sari
    ARABIAN JOURNAL FOR SCIENCE AND ENGINEERING, 2019, 44 (11) : 9043 - 9056
  • [50] Towards automatic recognition of emotion in speech
    Razak, AA
    Yusof, MHM
    Komiya, R
    PROCEEDINGS OF THE 3RD IEEE INTERNATIONAL SYMPOSIUM ON SIGNAL PROCESSING AND INFORMATION TECHNOLOGY, 2003, : 548 - 551