LSTM for Punctuation Restoration in Speech Transcripts

Cited by: 0
Authors
Tilk, Ottokar [1]
Alumae, Tanel [1]
Affiliations
[1] Tallinn Univ Technol, Inst Cybernet, EE-19086 Tallinn, Estonia
Source
16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5 | 2015
Keywords
neural network; punctuation restoration
DOI
Not available
Chinese Library Classification
O42 [Acoustics]
Subject Classification Codes
070206; 082403
Abstract
The output of automatic speech recognition systems is generally an unpunctuated stream of words, which is difficult to process for both humans and machines. We present a two-stage recurrent neural network model that uses long short-term memory units to restore punctuation in speech transcripts. In the first stage, textual features are learned on a large text corpus. The second stage combines these textual features with pause durations and adapts the model to the speech domain. Our approach reduces the number of punctuation errors by up to 16.9% compared to a decision tree that combines hidden-event language model posteriors with inter-word pause information, with the largest improvements in period restoration.
Pages: 683-687
Page count: 5
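The paper's own implementation is not shown here, but the two-stage design the abstract describes can be sketched in NumPy. This is a rough illustrative sketch with untrained random weights and hypothetical dimensions: stage 1 is an LSTM over word embeddings only, and stage 2 uses the same architecture with each word embedding concatenated with the following inter-word pause duration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, W, U, b):
    # One LSTM time step; gate pre-activations stacked as [input, forget, cell, output].
    H = h.shape[0]
    z = W @ x + U @ h + b
    i, f = sigmoid(z[:H]), sigmoid(z[H:2 * H])
    g, o = np.tanh(z[2 * H:3 * H]), sigmoid(z[3 * H:])
    c = f * c + i * g
    return o * np.tanh(c), c

def init_params(rng, in_dim, hidden, n_classes):
    # Random (untrained) weights, for illustration only.
    s = 0.1
    return {
        "W": rng.normal(0, s, (4 * hidden, in_dim)),
        "U": rng.normal(0, s, (4 * hidden, hidden)),
        "b": np.zeros(4 * hidden),
        "V": rng.normal(0, s, (n_classes, hidden)),  # output projection
    }

def predict(params, inputs, hidden):
    # Run the LSTM over the sequence; after each word, emit a softmax over
    # the punctuation classes for the following slot.
    h, c = np.zeros(hidden), np.zeros(hidden)
    probs = []
    for x in inputs:
        h, c = lstm_step(x, h, c, params["W"], params["U"], params["b"])
        logits = params["V"] @ h
        e = np.exp(logits - logits.max())
        probs.append(e / e.sum())
    return np.array(probs)

rng = np.random.default_rng(0)
EMB, HIDDEN, CLASSES, T = 8, 16, 3, 5  # classes: no punctuation, comma, period

# Stage 1: text-only model (in the paper, pretrained on a large text corpus).
stage1 = init_params(rng, EMB, HIDDEN, CLASSES)
words = rng.normal(size=(T, EMB))           # stand-in word embeddings
p1 = predict(stage1, words, HIDDEN)

# Stage 2: same architecture, but each input is the word embedding
# concatenated with the inter-word pause duration (one extra feature).
stage2 = init_params(rng, EMB + 1, HIDDEN, CLASSES)
pauses = rng.uniform(0, 1, size=(T, 1))     # seconds of silence after each word
p2 = predict(stage2, np.hstack([words, pauses]), HIDDEN)
```

In the paper the second stage adapts the text-pretrained model rather than training from scratch; here both stages simply use random weights to show the shapes of the data flow.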