Multi-Task Self-Supervised Learning for Disfluency Detection

Times cited: 0

Authors:

Wang, Shaolei [1]

Che, Wanxiang [1]

Liu, Qi [2]

Qin, Pengda [3]

Liu, Ting [1]

Wang, William Yang [4]

Affiliations:

[1] Harbin Inst Technol, Ctr Social Comp & Informat Retrieval, Harbin, Heilongjiang, Peoples R China

[2] Univ Oxford, Oxford, England

[3] Beijing Univ Posts & Telecommun, Beijing, Peoples R China

[4] Univ Calif Santa Barbara, Santa Barbara, CA 93106 USA

Source:

THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE | 2020 / Vol. 34

Funding:

National Natural Science Foundation of China;

Keywords:

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Most existing approaches to disfluency detection rely heavily on human-annotated data, which is expensive to obtain in practice. To tackle the training data bottleneck, we investigate methods for combining multiple self-supervised tasks, i.e., supervised tasks where data can be collected without manual labeling. First, we construct large-scale pseudo training data by randomly adding or deleting words from unlabeled news data, and propose two self-supervised pre-training tasks: (i) a tagging task to detect the added noisy words, and (ii) a sentence classification task to distinguish original sentences from grammatically incorrect sentences. We then combine these two tasks to jointly train a network. The pre-trained network is then fine-tuned using human-annotated disfluency detection training data. Experimental results on the commonly used English Switchboard test set show that our approach can achieve competitive performance compared to previous systems (trained using the full dataset) while using less than 1% (1000 sentences) of the training data. Our method trained on the full dataset significantly outperforms previous methods, reducing the error by 21% on English Switchboard.
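The pseudo-data construction described in the abstract can be illustrated with a rough sketch. This is not the authors' implementation: the function names, the noise probabilities, the filler vocabulary, and the choice to feed insertions to the tagging task and deletions to the classification task are all assumptions made for illustration. It also assumes whitespace-tokenized sentences of at least two words.

```python
import random

def add_noise(tokens, vocab, rng, p_add=0.15):
    """Insert random words before some tokens.

    Returns the noisy token list and parallel 0/1 tags (1 = inserted word).
    p_add and vocab are illustrative assumptions, not values from the paper.
    """
    noisy, tags = [], []
    for tok in tokens:
        if rng.random() < p_add:
            noisy.append(rng.choice(vocab))
            tags.append(1)
        noisy.append(tok)
        tags.append(0)
    return noisy, tags

def delete_words(tokens, rng, p_del=0.15):
    """Randomly drop words so the sentence is likely ungrammatical."""
    kept = [t for t in tokens if rng.random() >= p_del]
    if len(kept) == len(tokens) and len(tokens) > 1:
        # Nothing was dropped by chance: force one deletion.
        idx = rng.randrange(len(tokens))
        kept = tokens[:idx] + tokens[idx + 1:]
    if not kept:
        # Avoid emitting an empty sentence.
        kept = tokens[:1]
    return kept

def make_examples(sentence, vocab, seed=0):
    """Build one tagging example and two sentence-classification examples."""
    rng = random.Random(seed)
    tokens = sentence.split()
    tagging_example = add_noise(tokens, vocab, rng)
    corrupted = delete_words(tokens, rng)
    # Label 1 = original sentence, label 0 = corrupted sentence.
    classification_examples = [(tokens, 1), (corrupted, 0)]
    return tagging_example, classification_examples
```

Calling `make_examples("the cat sat on the mat", ["uh", "well", "i"])` yields a (tokens, tags) pair for the tagging task and an original/corrupted pair for sentence classification; a jointly trained network would consume both kinds of example.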

Pages: 9193-9200

Number of pages: 8
