Focal Loss for Punctuation Prediction

Cited by: 9
Authors
Yi, Jiangyan [1]
Tao, Jianhua [1, 2, 3]
Tian, Zhengkun [1, 3]
Bai, Ye [1, 3]
Fan, Cunhang [1, 3]
Affiliations
[1] Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing, Peoples R China
[2] Chinese Acad Sci, Ctr Excellence Brain Sci & Intelligence Technol, Beijing, Peoples R China
[3] Univ Chinese Acad Sci, Beijing, Peoples R China
Source
INTERSPEECH 2020 | 2020
Funding
National Natural Science Foundation of China
Keywords
focal loss; class imbalance; punctuation prediction; speech recognition; models
DOI
10.21437/Interspeech.2020-1638
Chinese Library Classification (CLC) number
R36 [Pathology]; R76 [Otorhinolaryngology];
Discipline classification code
100104 ; 100213 ;
Abstract
Many approaches have been proposed for predicting punctuation marks, and previous results show that they are effective. However, a class imbalance problem remains during training: most labels in a punctuation prediction training set belong to the non-punctuation class. This imbalance degrades punctuation prediction performance. This paper therefore uses a focal loss to alleviate the issue. The focal loss down-weights easy examples and focuses training on a sparse set of hard examples. Experiments are conducted on the IWSLT2011 datasets. The results show that punctuation prediction models trained with a focal loss improve over those trained with a cross-entropy loss by up to 2.7% absolute overall F1-score on the test set. The proposed model also outperforms previous state-of-the-art models.
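The down-weighting described in the abstract can be illustrated with a minimal sketch. It assumes the widely used multi-class form of focal loss, FL(p_t) = -(1 - p_t)^gamma * log(p_t), where p_t is the predicted probability of the true class. This record does not state the exact variant or hyperparameters used in the paper, so the value gamma = 2.0, the four-class label layout, and the example logits are all illustrative assumptions.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Standard cross-entropy loss for a single token."""
    p_t = softmax(logits)[target]
    return -math.log(p_t)


def focal_loss(logits, target, gamma=2.0):
    """Focal loss for a single token: -(1 - p_t)^gamma * log(p_t).

    gamma = 0 recovers cross-entropy. The default gamma is an assumption
    for illustration, not a value taken from the paper.
    """
    p_t = softmax(logits)[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


# Hypothetical label layout: 0 = no punctuation, 1 = comma,
# 2 = period, 3 = question mark.
easy_example = ([4.0, 0.0, 0.0, 0.0], 0)  # confident, correct "no punctuation"
hard_example = ([1.0, 0.0, 0.5, 0.0], 2)  # true label "period" gets low probability

for name, (logits, target) in [("easy", easy_example), ("hard", hard_example)]:
    ce = cross_entropy(logits, target)
    fl = focal_loss(logits, target)
    # The ratio fl / ce equals (1 - p_t)^gamma, the modulating factor.
    print(f"{name}: CE={ce:.4f}  FL={fl:.4f}  FL/CE={fl / ce:.4f}")
```

In this sketch the modulating factor (1 - p_t)^gamma shrinks the loss of the confidently classified non-punctuation token far more than that of the misclassified period, which is how frequent, easy tokens end up contributing less to the total loss.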
Pages: 721-725
Page count: 5
Related papers
50 in total
  • [1] Punctuation Prediction Model for Conversational Speech
    Zelasko, Piotr
    Szymanski, Piotr
    Mizgajski, Jan
    Szymczak, Adrian
    Carmiel, Yishay
    Dehak, Najim
    19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 2633 - 2637
  • [2] Decoding-Time Prediction of Non-Verbalized Punctuation
    Deoras, Anoop
    Fritsch, Juergen
    INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5, 2008, : 1449 - +
  • [3] Leveraging Prosody for Punctuation Prediction of Spontaneous Speech
    Cho, Jenny Yeonjin
    Ng, Sara
    Tran, Trang
    Ostendorf, Mari
    INTERSPEECH 2022, 2022, : 555 - 559
  • [4] Capitalization and punctuation restoration: a survey
    Pais, Vasile
    Tufis, Dan
    ARTIFICIAL INTELLIGENCE REVIEW, 2022, 55 (03) : 1681 - 1722
  • [5] Joint prediction of punctuation and disfluency in speech transcripts
    Lin, Binghuai
    Wang, Liyuan
    INTERSPEECH 2020, 2020, : 716 - 720
  • [6] FullStop: punctuation and segmentation prediction for Dutch with transformers
    Vandeghinste, Vincent
    Guhr, Oliver
    LANGUAGE RESOURCES AND EVALUATION, 2024, 58 (04) : 1335 - 1354
  • [7] Multimodal Semi-supervised Learning Framework for Punctuation Prediction in Conversational Speech
    Sunkara, Monica
    Ronanki, Srikanth
    Bekal, Dhanush
    Bodapati, Sravan
    Kirchhoff, Katrin
    INTERSPEECH 2020, 2020, : 4911 - 4915
  • [8] Investigating LSTM for Punctuation Prediction
    Xu, Kaituo
    Xie, Lei
    Yao, Kaisheng
    2016 10TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2016,
  • [9] Punctuation Prediction in Bangla Text
    Rahman, Habibur
    Rahin, Rezwan Shahrior
    Mahbub, Araf Mohammad
    Islam, Adnanul
    Mukta, Saddam Hossain
    Rahman, Mahbubur
    ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2023, 22 (03)
  • [10] Transfer knowledge for punctuation prediction via adversarial training
    Yi, Jiangyan
    Tao, Jianhua
    Bai, Ye
    Tian, Zhengkun
    Fan, Cunhang
    SPEECH COMMUNICATION, 2023, 149 : 1 - 10