Discriminative Self-training for Punctuation Prediction

Cited by: 3
Authors
Chen, Qian [1]
Wang, Wen [1]
Chen, Mengzhe [1]
Zhang, Qinglin [1]
Affiliation
[1] Alibaba Group, Speech Lab, Hangzhou, People's Republic of China
Source
INTERSPEECH 2021 | 2021
Keywords
punctuation prediction; self-training; label smoothing; Transformer; BERT
DOI
10.21437/Interspeech.2021-246
Chinese Library Classification
R36 [Pathology]; R76 [Otorhinolaryngology]
Subject classification codes
100104; 100213
Abstract
Punctuation prediction for automatic speech recognition (ASR) output transcripts plays a crucial role in improving the readability of the transcripts and the performance of downstream natural language processing applications. However, achieving good performance on punctuation prediction often requires large amounts of labeled speech transcripts, which are expensive and laborious to obtain. In this paper, we propose a Discriminative Self-Training approach with weighted loss and discriminative label smoothing to exploit unlabeled speech transcripts. Experimental results on the English IWSLT2011 benchmark test set and an internal Chinese spoken language dataset demonstrate that the proposed approach significantly improves punctuation prediction accuracy over strong baselines, including BERT, RoBERTa, and ELECTRA models. The proposed Discriminative Self-Training approach also outperforms the vanilla self-training approach. We establish a new state of the art (SOTA) on the IWSLT2011 test set, outperforming the previous SOTA model by an absolute gain of 1.3% in F1.
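The abstract names two ingredients, a weighted loss and discriminative label smoothing, without giving their formulas. The following is a minimal sketch of one plausible reading, not the authors' implementation: it assumes that human-labeled ("gold") examples use plain cross-entropy, that pseudo-labeled examples use label-smoothed cross-entropy, and that the pseudo-labeled term is scaled by a scalar weight. The function names and the parameters `pseudo_weight` and `pseudo_epsilon` are hypothetical.

```python
import math


def smoothed_cross_entropy(probs, target, epsilon):
    """Cross-entropy of predicted class probabilities against a
    label-smoothed target: (1 - epsilon) * one_hot + epsilon / num_classes.
    With epsilon = 0 this reduces to -log(probs[target])."""
    num_classes = len(probs)
    loss = 0.0
    for i, p in enumerate(probs):
        q = epsilon / num_classes
        if i == target:
            q += 1.0 - epsilon
        if q > 0.0:
            loss -= q * math.log(p)
    return loss


def self_training_loss(labeled, pseudo, pseudo_weight=0.5, pseudo_epsilon=0.1):
    """Combine losses over labeled and pseudo-labeled token predictions.

    labeled, pseudo: lists of (probs, target) pairs, one per token.
    ASSUMPTION: smoothing is applied only to pseudo-labels, and the
    pseudo-labeled term is down-weighted by pseudo_weight.
    """
    gold = sum(smoothed_cross_entropy(p, t, 0.0) for p, t in labeled)
    silver = sum(smoothed_cross_entropy(p, t, pseudo_epsilon) for p, t in pseudo)
    return gold + pseudo_weight * silver


# Toy example with three punctuation classes (e.g. none, comma, period).
labeled_batch = [([0.7, 0.2, 0.1], 0)]
pseudo_batch = [([0.6, 0.3, 0.1], 1)]
total = self_training_loss(labeled_batch, pseudo_batch)
```

Treating the two data sources differently reflects the intuition that pseudo-labels produced by a teacher model are noisier than human annotations, so the model should be less confident about them. The paper itself should be consulted for the exact formulation and hyperparameters.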
Pages: 771-775
Page count: 5