Multi-Task Self-Supervised Learning for Disfluency Detection

Authors
Wang, Shaolei [1]
Che, Wanxiang [1]
Liu, Qi [2]
Qin, Pengda [3]
Liu, Ting [1]
Wang, William Yang [4]
Affiliations
[1] Harbin Inst Technol, Ctr Social Comp & Informat Retrieval, Harbin, Heilongjiang, Peoples R China
[2] Univ Oxford, Oxford, England
[3] Beijing Univ Posts & Telecommun, Beijing, Peoples R China
[4] Univ Calif Santa Barbara, Santa Barbara, CA 93106 USA
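Funding
National Natural Science Foundation of China;
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract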
Most existing approaches to disfluency detection rely heavily on human-annotated data, which is expensive to obtain in practice. To tackle the training data bottleneck, we investigate methods for combining multiple self-supervised tasks, i.e., supervised tasks where data can be collected without manual labeling. First, we construct large-scale pseudo training data by randomly adding or deleting words from unlabeled news data, and propose two self-supervised pre-training tasks: (i) a tagging task to detect the added noisy words, and (ii) a sentence classification task to distinguish original sentences from grammatically incorrect sentences. We then combine these two tasks to jointly train a network. The pre-trained network is then fine-tuned using human-annotated disfluency detection training data. Experimental results on the commonly used English Switchboard test set show that our approach can achieve competitive performance compared to previous systems (trained using the full dataset) by using less than 1% (1000 sentences) of the training data. Our method trained on the full dataset significantly outperforms previous methods, reducing the error by 21% on English Switchboard.
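The pseudo-data construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the number of inserted words, the 50/50 choice between adding and deleting, and the toy vocabulary are all assumptions made for illustration. The sketch also does not check that a corrupted sentence is actually ungrammatical.

```python
import random
from typing import List, Tuple


def add_noise_for_tagging(
    tokens: List[str], vocab: List[str], rng: random.Random, n_insert: int = 2
) -> Tuple[List[str], List[int]]:
    """Insert n_insert random words (hypothetical setting).

    Returns the noisy tokens and per-token tags, where 1 marks an added word.
    """
    noisy = list(tokens)
    tags = [0] * len(tokens)
    for _ in range(n_insert):
        pos = rng.randint(0, len(noisy))  # randint is inclusive on both ends
        noisy.insert(pos, rng.choice(vocab))
        tags.insert(pos, 1)
    return noisy, tags


def corrupt_for_classification(
    tokens: List[str], vocab: List[str], rng: random.Random
) -> List[str]:
    """Randomly delete one word or add one word (assumed 50/50 split)."""
    noisy = list(tokens)
    if len(noisy) > 1 and rng.random() < 0.5:
        del noisy[rng.randrange(len(noisy))]
    else:
        noisy.insert(rng.randint(0, len(noisy)), rng.choice(vocab))
    return noisy


def make_examples(sentence: str, vocab: List[str], seed: int = 0):
    """Build one tagging example and two classification examples."""
    rng = random.Random(seed)
    tokens = sentence.split()
    tagging = add_noise_for_tagging(tokens, vocab, rng)
    # Label 1 = original sentence, label 0 = corrupted sentence.
    classification = [
        (tokens, 1),
        (corrupt_for_classification(tokens, vocab, rng), 0),
    ]
    return tagging, classification


if __name__ == "__main__":
    (noisy, tags), cls = make_examples(
        "the cat sat on the mat", ["uh", "well", "blue"], seed=0
    )
    print(noisy, tags)
    print(cls)
```

In this sketch, the tagging example and the classification examples come from the same unlabeled sentence, so neither task needs manual annotation.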
Pages: 9193-9200
Page count: 8
Related papers
50 in total
  • [31] ProteinGLUE multi-task benchmark suite for self-supervised protein modeling
    Henriette Capel
    Robin Weiler
    Maurits Dijkstra
    Reinier Vleugels
    Peter Bloem
    K. Anton Feenstra
    Scientific Reports, 12
  • [32] GMSS: Graph-Based Multi-Task Self-Supervised Learning for EEG Emotion Recognition
    Li, Yang
    Chen, Ji
    Li, Fu
    Fu, Boxun
    Wu, Hao
    Ji, Youshuo
    Zhou, Yijin
    Niu, Yi
    Shi, Guangming
    Zheng, Wenming
    IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, 2023, 14 (03) : 2512 - 2525
  • [33] ProteinGLUE multi-task benchmark suite for self-supervised protein modeling
    Capel, Henriette
    Weiler, Robin
    Dijkstra, Maurits
    Vleugels, Reinier
    Bloem, Peter
    Feenstra, K. Anton
    SCIENTIFIC REPORTS, 2022, 12 (01)
  • [34] Cross-Domain Self-supervised Multi-task Feature Learning using Synthetic Imagery
    Ren, Zhongzheng
    Lee, Yong Jae
    2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, : 762 - 771
  • [35] MULTI-TASK SELF-SUPERVISED PRE-TRAINING FOR MUSIC CLASSIFICATION
    Wu, Ho-Hsiang
    Kao, Chieh-Chi
    Tang, Qingming
    Sun, Ming
    McFee, Brian
    Bello, Juan Pablo
    Wang, Chao
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 556 - 560
  • [36] Self-supervised Adversarial Multi-task Learning for Vocoder-based Monaural Speech Enhancement
    Du, Zhihao
    Lei, Ming
    Han, Jiqing
    Zhang, Shiliang
    INTERSPEECH 2020, 2020, : 3271 - 3275
  • [37] MSLSNet: A combination of multi-task self-supervised learning and Swin transformer network for face and keypoint detection in thermal images
    Aghaomidi, Poorya
    Mirzaei, Fatemeh
    Bahmani, Zahra
    EXPERT SYSTEMS WITH APPLICATIONS, 2025, 268
  • [38] Pano-SfMLearner: Self-Supervised Multi-Task Learning of Depth and Semantics in Panoramic Videos
    Liu, Mengyi
    Wang, Shuhui
    Guo, Yulan
    He, Yuan
    Xue, Hui
    IEEE SIGNAL PROCESSING LETTERS, 2021, 28 : 832 - 836
  • [39] Semi-supervised Hotspot Detection with Self-paced Multi-Task Learning
    Chen, Ying
    Lin, Yibo
    Gai, Tianyang
    Su, Yajuan
    Wei, Yayi
    Pan, David Z.
    24TH ASIA AND SOUTH PACIFIC DESIGN AUTOMATION CONFERENCE (ASP-DAC 2019), 2019, : 420 - 425
  • [40] Portfolio management using online reinforcement learning with adaptive exploration and Multi-task self-supervised representation
    Sang, Chuan-Yun
    Huang, Szu-Hao
    Chen, Chiao-Ting
    Chang, Heng-Ta
    APPLIED SOFT COMPUTING, 2025, 172