An Empirical Study on Punctuation Restoration for English, Mandarin, and Code-Switching Speech

Authors
Liu, Changsong [1]
Thi Nga Ho [1]
Chng, Eng Siong [1]
Affiliations
[1] Nanyang Technol Univ, Singapore, Singapore
Source
INTELLIGENT INFORMATION AND DATABASE SYSTEMS, ACIIDS 2023, PT II | 2023 / Vol. 13996
Funding
National Research Foundation, Singapore;
Keywords
Punctuation Restoration; Multilingual; Codeswitching; Automatic Speech Recognition; Singaporean Speech;
DOI
10.1007/978-981-99-5837-5_24
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Punctuation restoration is a crucial step in enriching the transcripts produced by Automatic Speech Recognition (ASR) systems. This paper presents an empirical study of how different data acquisition and training strategies affect the performance of punctuation restoration models for multilingual and code-switching speech. The study focuses on two of the most widely spoken languages in Singapore, English and Mandarin, in both monolingual and code-switching forms. Specifically, we experiment with in-domain and out-of-domain evaluation for multilingual and code-switching speech. We then enlarge the training data by sampling the code-switching corpus by reordering its conversational transcripts. We also propose ensembling the prediction models by averaging saved model checkpoints, instead of using only the last checkpoint, to improve model performance. The model uses a slot-filling approach to predict the punctuation at each word boundary. By utilizing and enlarging the available datasets and ensembling different model checkpoints, the model reaches F1 scores of 76.5% and 79.5% on the monolingual and code-switching test sets, respectively, exceeding state-of-the-art performance. This investigation contributes to the existing literature on punctuation restoration for multilingual and code-switching speech, and it offers insights into the importance of averaging model checkpoints for the final model's performance. Source code and trained models are published in our GitHub repository for future replication and use (https://github.com/charlieliu331/Punctuation_Restoration).
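As an illustration of the checkpoint-averaging idea mentioned in the abstract, the following is a minimal sketch rather than the authors' implementation. It assumes that "averaging saved model checkpoints" means taking the element-wise mean of parameters across checkpoints; the abstract does not say whether weights or predictions are averaged. The function name `average_checkpoints`, the parameter names, and the toy values are hypothetical, and plain Python lists stand in for real weight tensors.

```python
from typing import Dict, List

# Hypothetical representation: parameter name -> flattened list of weights.
Checkpoint = Dict[str, List[float]]


def average_checkpoints(checkpoints: List[Checkpoint]) -> Checkpoint:
    """Return the element-wise mean of each parameter across checkpoints."""
    if not checkpoints:
        raise ValueError("need at least one checkpoint")
    averaged: Checkpoint = {}
    for name in checkpoints[0]:
        tensors = [ckpt[name] for ckpt in checkpoints]
        # zip(*tensors) visits the same weight position in every checkpoint.
        averaged[name] = [sum(vals) / len(vals) for vals in zip(*tensors)]
    return averaged


# Toy checkpoints standing in for weights saved at different training steps
# (assumed values, for illustration only).
ckpt_a = {"classifier.weight": [1.0, 2.0], "classifier.bias": [0.0]}
ckpt_b = {"classifier.weight": [3.0, 4.0], "classifier.bias": [1.0]}
avg = average_checkpoints([ckpt_a, ckpt_b])
print(avg)
```

In practice the same loop would run over framework tensors loaded from saved checkpoint files, and the averaged parameters would be loaded back into a single model for evaluation.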
Pages: 286-296
Page count: 11
Related papers
50 in total
  • [1] Pronunciation augmentation for Mandarin-English code-switching speech recognition
    Long, Yanhua
    Wei, Shuang
    Lian, Jie
    Li, Yijie
    EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2021, 2021 (01)
  • [2] Pronunciation augmentation for Mandarin-English code-switching speech recognition
    Yanhua Long
    Shuang Wei
    Jie Lian
    Yijie Li
    EURASIP Journal on Audio, Speech, and Music Processing, 2021
  • [3] INVESTIGATING END-TO-END SPEECH RECOGNITION FOR MANDARIN-ENGLISH CODE-SWITCHING
    Shan, Changhao
    Weng, Chao
    Wang, Guangsen
    Su, Dan
    Luo, Min
    Yu, Dong
    Xie, Lei
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 6056 - 6060
  • [4] NON-AUTOREGRESSIVE MANDARIN-ENGLISH CODE-SWITCHING SPEECH RECOGNITION
    Chuang, Shun-Po
    Chang, Heng-Jui
    Huang, Sung-Feng
    Lee, Hung-yi
    2021 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU), 2021, : 465 - 472
  • [5] Acoustic data augmentation for Mandarin-English code-switching speech recognition
    Long, Yanhua
    Li, Yijie
    Zhang, Qiaozheng
    Wei, Shuang
    Ye, Hong
    Yang, Jichen
    APPLIED ACOUSTICS, 2020, 161
  • [6] ADDRESSING ACCENT MISMATCH IN MANDARIN-ENGLISH CODE-SWITCHING SPEECH RECOGNITION
    Tan, Zhili
    Fan, Xinghua
    Zhu, Hui
    Lin, Ed
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 8259 - 8263
  • [7] On the End-to-End Solution to Mandarin-English Code-switching Speech Recognition
    Zeng, Zhiping
    Khassanov, Yerbolat
    Van Tung Pham
    Xu, Haihua
    Chng, Eng Siong
    Li, Haizhou
    INTERSPEECH 2019, 2019, : 2165 - 2169
  • [8] Study of Semi-supervised Approaches to Improving English-Mandarin Code-Switching Speech Recognition
    Guo, Pengcheng
    Xu, Haihua
    Xie, Lei
    Chng, Eng Siong
    19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 1928 - 1932
  • [9] Monolingual Data Selection Analysis for English-Mandarin Hybrid Code-switching Speech Recognition
    Zhang, Haobo
    Xu, Haihua
    Van Tung Pham
    Huang, Hao
    Chng, Eng Siong
    INTERSPEECH 2020, 2020, : 2392 - 2396
  • [10] AN EVALUATION BENCHMARK FOR AUTOMATIC SPEECH RECOGNITION OF GERMAN-ENGLISH CODE-SWITCHING
    Khosravani, Abbas
    Garner, Philip N.
    Lazaridis, Alexandros
    2021 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU), 2021, : 811 - 816