Granular Syntax Processing with Multi-Task and Curriculum Learning

Cited by: 0
Authors
Zhang, Xulang [1]
Mao, Rui [1]
Cambria, Erik [1]
Affiliations
[1] Nanyang Technological University, College of Computing and Data Science, Singapore
Keywords
Text chunking; Part-of-speech tagging; Sentence boundary detection; Multi-task learning; Granularity computing; Curriculum learning; Sentiment analysis; Mechanism; Network; Models
DOI
10.1007/s12559-024-10320-1
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Syntactic processing techniques are the foundation of natural language processing (NLP), supporting many downstream NLP tasks. In this paper, we conduct pair-wise multi-task learning (MTL) on syntactic tasks of different granularities, namely Sentence Boundary Detection (SBD), text chunking, and Part-of-Speech (PoS) tagging, to investigate the extent to which they complement each other. We propose a novel soft parameter-sharing mechanism to share local and global dependency information learned from both target tasks. We also propose a curriculum learning (CL) mechanism to improve MTL with non-parallel labeled data. Using non-parallel labeled data in MTL is common practice, yet it has received little attention. For example, the PoS tagging data we use do not have text chunking labels. When learning PoS tagging and text chunking together, the proposed CL mechanism aims to select complementary samples from the two tasks to update the parameters of the MTL model in the same training batch. This method yields better performance and learning stability. We conclude that the fine-grained tasks can provide complementary features to coarse-grained ones, while the most coarse-grained task, SBD, provides useful information for the most fine-grained one, PoS tagging. Additionally, the text chunking task achieves state-of-the-art performance when learned jointly with PoS tagging. Our analytical experiments also show the effectiveness of the proposed soft parameter-sharing and CL mechanisms.
Pages: 3020-3034
Number of pages: 15