Evolution of the performance of automatic speech recognition algorithms in transcribing conversational telephone speech

Cited by: 0
Authors
Padmanabhan, M [1]
Saon, G [1]
Zweig, G [1]
Huang, J [1]
Kingsbury, B [1]
Mangu, L [1]
Affiliations
[1] IBM Corp, Thomas J Watson Res Ctr, Yorktown Hts, NY 10598 USA
Source
IMTC/2001: PROCEEDINGS OF THE 18TH IEEE INSTRUMENTATION AND MEASUREMENT TECHNOLOGY CONFERENCE, VOLS 1-3: REDISCOVERING MEASUREMENT IN THE AGE OF INFORMATICS | 2001
Keywords
speech recognition; spontaneous speech; telephone speech; discriminant transforms; boosting; consensus; formant frequencies; spectral peaks
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Research in the speech recognition (speech-to-text conversion) area has been underway for a couple of decades, and a great deal of progress has been made in reducing the word error rate (WER). In this paper, we attempt to summarize the state of the art in speech recognition algorithms. The algorithms we describe span the areas of lexicon design, feature extraction, classifier design, combination of hypotheses, and speaker adaptation of acoustic models. We will benchmark the algorithms on two main sources of speech, the first being Voicemail (conversational telephone speech from a single speaker) and the second being Switchboard (conversational telephone speech between two speakers). We also present the results of some cross-domain experiments which highlight the "brittleness" of speech recognition systems today and illustrate the need to focus research effort on improving cross-domain performance.
Pages: 1926-1931
Number of pages: 4