Multi-Task Self-Supervised Learning for Disfluency Detection

Cited by: 0
Authors
Wang, Shaolei [1 ]
Che, Wanxiang [1 ]
Liu, Qi [2 ]
Qin, Pengda [3 ]
Liu, Ting [1 ]
Wang, William Yang [4 ]
Affiliations
[1] Harbin Inst Technol, Ctr Social Comp & Informat Retrieval, Harbin, Heilongjiang, Peoples R China
[2] Univ Oxford, Oxford, England
[3] Beijing Univ Posts & Telecommun, Beijing, Peoples R China
[4] Univ Calif Santa Barbara, Santa Barbara, CA 93106 USA
Funding
National Natural Science Foundation of China
Keywords
DOI
N/A
CLC number
TP18 [Theory of Artificial Intelligence]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Most existing approaches to disfluency detection rely heavily on human-annotated data, which is expensive to obtain in practice. To tackle this training-data bottleneck, we investigate methods for combining multiple self-supervised tasks, i.e., supervised tasks where data can be collected without manual labeling. First, we construct large-scale pseudo training data by randomly adding or deleting words from unlabeled news data, and propose two self-supervised pre-training tasks: (i) a tagging task to detect the added noisy words, and (ii) a sentence-classification task to distinguish original sentences from grammatically incorrect ones. We then combine these two tasks to jointly train a network. The pre-trained network is then fine-tuned using human-annotated disfluency detection training data. Experimental results on the commonly used English Switchboard test set show that our approach achieves competitive performance compared with previous systems (trained on the full dataset) while using less than 1% (1,000 sentences) of the training data. Trained on the full dataset, our method significantly outperforms previous methods, reducing the error by 21% on English Switchboard.
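The pseudo-data construction described in the abstract can be illustrated with a small sketch. The version below covers only the word-insertion case (the paper also deletes words to produce ungrammatical sentences for the classification task); the function name, the `ADD`/`O` tag names, and the insertion heuristics are illustrative assumptions, not the authors' exact procedure.

```python
import random

def make_pseudo_example(words, vocab, max_inserts=3, seed=None):
    """Create a pseudo-disfluent sentence by randomly inserting words.

    Inserted words are tagged 'ADD' (to be detected by the tagging
    task); original words are tagged 'O'. The boolean return value is
    the sentence-level label for the classification task (True if the
    sentence was corrupted).
    """
    rng = random.Random(seed)
    noisy, tags = list(words), ["O"] * len(words)
    for _ in range(rng.randint(0, max_inserts)):
        pos = rng.randint(0, len(noisy))
        # Insert either a repetition of a nearby word (a common
        # pattern in real disfluencies) or a random vocabulary word.
        if noisy and rng.random() < 0.5:
            w = noisy[max(0, pos - 1)]
        else:
            w = rng.choice(vocab)
        noisy.insert(pos, w)
        tags.insert(pos, "ADD")
    return noisy, tags, len(noisy) > len(words)
```

A network pre-trained to recover the `ADD` positions and the sentence-level label on such data can then be fine-tuned on the (much smaller) human-annotated Switchboard training set.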
Pages: 9193-9200
Page count: 8