Experiments in Character-level Neural Network Models for Punctuation

Cited by: 7
Authors
Gale, William [1]
Parthasarathy, Sarangarajan [2]
Affiliations
[1] Univ Adelaide, Adelaide, SA, Australia
[2] Microsoft, Redmond, WA, USA
Source
18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION | 2017
Keywords
speech recognition; punctuation prediction; neural networks
DOI
10.21437/Interspeech.2017-1710
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
We explore character-level neural network models for inferring punctuation from text-only input. Punctuation inference is treated as a sequence tagging problem in which the input is a sequence of unpunctuated characters and the output is a corresponding sequence of punctuation tags. We experiment with six architectures, all of which use a long short-term memory (LSTM) network for sequence modeling. They differ in how the context and lookahead for a given character are derived: from a simple character embedding with delayed output to enable lookahead, to more complex convolutional neural networks (CNNs) that capture context. We demonstrate that the accuracy of the proposed character-level models is competitive with that of a state-of-the-art word-level Conditional Random Field (CRF) baseline with carefully crafted features.
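To make the sequence-tagging formulation in the abstract concrete, here is a minimal sketch of how punctuated text might be turned into a training pair of unpunctuated characters and per-character tags. The tag names (`O`, `COMMA`, `PERIOD`, `QUESTION`), the choice to attach each tag to the character that precedes the punctuation mark, and the helper name `to_tagging_example` are all assumptions made for illustration; the abstract does not specify the tag set or alignment the authors used.

```python
# Hypothetical tag inventory; the record does not list the paper's actual tags.
PUNCT_TAGS = {",": "COMMA", ".": "PERIOD", "?": "QUESTION"}
NO_PUNCT = "O"  # tag for characters not followed by punctuation


def to_tagging_example(text):
    """Split punctuated text into (characters, tags) of equal length.

    Assumption: a punctuation mark is removed from the input and recorded
    as the tag of the character immediately before it.
    """
    chars, tags = [], []
    for ch in text:
        if ch in PUNCT_TAGS:
            if tags:  # punctuation at the very start has nothing to attach to
                tags[-1] = PUNCT_TAGS[ch]
        else:
            chars.append(ch)
            tags.append(NO_PUNCT)
    return chars, tags


chars, tags = to_tagging_example("hi, you.")
for c, t in zip(chars, tags):
    print(repr(c), t)
```

In this sketch the input and output sequences always have the same length, which is what lets a tagger emit one label per character. A model that needs lookahead could, under the same assumptions, be trained to emit the tag for a character a few steps after reading it.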
Pages: 2794-2798
Number of pages: 5
Related papers
50 in total
  • [1] Character-Level Neural Language Modelling in the Clinical Domain
    Kreuzthaler, Markus
    Oleynik, Michel
    Schulz, Stefan
    DIGITAL PERSONALIZED HEALTH AND MEDICINE, 2020, 270 : 83 - 87
  • [2] Character-level HyperNetworks for Hate Speech Detection
    Wullach, Tomer
    Adler, Amir
    Minkov, Einat
    EXPERT SYSTEMS WITH APPLICATIONS, 2022, 205
  • [3] CharCaps: Character-Level Text Classification Using Capsule Networks
    Wu, Yujia
    Guo, Xin
    Zhan, Kangning
    ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, ICIC 2023, PT II, 2023, 14087 : 187 - 198
  • [4] Crowdsourcing the character of a place: Character-level convolutional networks for multilingual geographic text classification
    Adams, Benjamin
    McKenzie, Grant
    TRANSACTIONS IN GIS, 2018, 22 (02) : 394 - 408
  • [5] Neural network models of learning and categorization in multigame experiments
    Marchiori, Davide
    Warglien, Massimo
    FRONTIERS IN NEUROSCIENCE, 2011, 5
  • [6] Cross-domain Speech Recognition with Unsupervised Character-level Distribution Matching
    Hou, Wenxin
    Wang, Jindong
    Tan, Xu
    Qin, Tao
    Shinozaki, Takahiro
    INTERSPEECH 2021, 2021, : 3425 - 3429
  • [7] A 43 Language Multilingual Punctuation Prediction Neural Network Model
    Li, Xinxing
    Lin, Edward
    INTERSPEECH 2020, 2020, : 1067 - 1071
  • [8] Facial Emotion Expression Corpora for Training Game Character Neural Network Models
    Schiffer, Sheldon
    Zhang, Samantha
    Levine, Max
    PROCEEDINGS OF THE 17TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER VISION, IMAGING AND COMPUTER GRAPHICS THEORY AND APPLICATIONS (HUCAPP), VOL 2, 2022, : 197 - 208
  • [9] A comparison study between MLP and Convolutional Neural Network models for character recognition
    Ben Driss, S.
    Soua, M.
    Kachouri, R.
    Akil, M.
    REAL-TIME IMAGE AND VIDEO PROCESSING 2017, 2017, 10223
  • [10] Overcoming Data Scarcity in Speaker Identification: Dataset Augmentation with Synthetic MFCCs via Character-level RNN
    Bird, Jordan J.
    Faria, Diego R.
    Premebida, Cristiano
    Ekart, Aniko
    Ayrosa, Pedro P. S.
    2020 IEEE INTERNATIONAL CONFERENCE ON AUTONOMOUS ROBOT SYSTEMS AND COMPETITIONS (ICARSC 2020), 2020, : 146 - 151