Thinking about the present and future of the complex speech recognition

Cited by: 0
Author
Vicsi, Klara [1]
Affiliation
[1] Budapest Univ Technol & Econ, Dept Telecommun & Mediainformat, Lab Speech Acoust, Budapest, Hungary
Source
3RD IEEE INTERNATIONAL CONFERENCE ON COGNITIVE INFOCOMMUNICATIONS (COGINFOCOM 2012) | 2012
Keywords
component; speech recognition; speech to text transformation system; multi-modal speech processing; multi-stream modelling; FEATURES;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A critical point for most cognitive info-communication systems is the state of development of speech recognition technology. The paper gives a short introduction to the principles of today's speech recognition technology. It highlights the fact that the systems on the market are only speech-to-text transformers that output just a word chain, in which speech prosody, speech emotion, speech style and much other information are not involved. Many uncertainties exist in such an operational system. Some up-to-date research tendencies, mostly parallel processing, are introduced that aim to increase the efficiency of recognition. Finally, the research agenda of META NET for Multilingual Europe in 2020 is briefly introduced.
Pages: 371-376
Number of pages: 6