Thinking about the present and future of the complex speech recognition

Cited by: 0

Authors:

Vicsi, Klara [1]

Affiliations:

[1] Budapest Univ Technol & Econ, Dept Telecommun & Mediainformat, Lab Speech Acoust, Budapest, Hungary

Source:

3RD IEEE INTERNATIONAL CONFERENCE ON COGNITIVE INFOCOMMUNICATIONS (COGINFOCOM 2012) | 2012

Keywords:

component; speech recognition; speech to text transformation system; multi-modal speech processing; multi-stream modelling; FEATURES;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

A critical point for most cognitive info-communication systems is the state of development of speech recognition technology. The paper gives a short introduction to the principles of present-day speech recognition technology. It highlights the fact that the systems on the market are only speech-to-text transformers that output nothing but a word chain, leaving out speech prosody, speech emotion, speech style, and other information. Many uncertainties exist in such operational systems. Some up-to-date research tendencies, mostly parallel processing, are introduced that aim to increase the efficiency of recognition. Finally, the META-NET research agenda for Multilingual Europe 2020 is briefly introduced.

Citation

Pages: 371-376

Number of pages: 6

50 items in total

[31] A Survey of Multilingual Models for Automatic Speech Recognition
Yadav, Hemant
Sitaram, Sunayana
LREC 2022: THIRTEENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2022, : 5071 - 5079
[32] Parallel speech recognition
Phillips, S
Rogers, A
INTERNATIONAL JOURNAL OF PARALLEL PROGRAMMING, 1999, 27 (04) : 257 - 288
[33] An introduction to speech recognition
DeBleecker, MR
GLOBAL VISION, 1996, : 251 - 254
[34] Parallel Speech Recognition
Steven Phillips
Anne Rogers
International Journal of Parallel Programming, 1999, 27 : 257 - 288
[35] Speech Disorder Malay Speech Recognition System
Al-Haddad, S. A. R.
SENSORS, SIGNALS, VISUALIZATION, IMAGING, SIMULATION AND MATERIALS, 2009, : 69 - 75
[36] SPEECH ENHANCEMENT FOR TELEPHONY NAME SPEECH RECOGNITION
You, Chang Huai
Rahardja, Susanto
Li, Haizhou
2008 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, VOLS 1-4, 2008, : 973 - 976
[37] MODIFICATION ON LSA SPEECH ENHANCEMENT FOR SPEECH RECOGNITION
You, Chang Huai
Ma, Bin
Ni, Chongjia
2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 5475 - 5479
[38] Speech and Speech Recognition during Dictation Corrections
Vertanen, Keith
INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 1890 - 1893
[39] On the Correlation and Transferability of Features between Automatic Speech Recognition and Speech Emotion Recognition
Fayek, Haytham M.
Lech, Margaret
Cavedon, Lawrence
17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 3618 - 3622
[40] Discriminative Named Entity Recognition of Speech Data using Speech Recognition Confidence
Sudoh, Katsuhito
Tsukada, Hajime
Isozaki, Hideki
INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 337 - 340
