Enhanced prototypical network for few-shot relation extraction

Cited by: 30
Authors
Wen, Wen [1 ]
Liu, Yongbin [1 ]
Ouyang, Chunping [1 ,3 ]
Lin, Qiang [1 ]
Chung, Tonglee [2 ]
Affiliations
[1] Univ South China, Sch Comp, Hengyang, Hunan, Peoples R China
[2] Tsinghua Univ, Dept Comp Sci & Technol, Beijing 100084, Peoples R China
[3] Hunan Prov Base Sci & Technol Innovat Cooperat, Xiangtan, Hunan, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Few-shot learning; Transformer; Relation extraction;
DOI
10.1016/j.ipm.2021.102596
Chinese Library Classification
TP [Automation technology; computer technology]
Subject Classification Code
0812
Abstract
Most existing methods for relation extraction depend heavily on large-scale annotated data; they cannot learn from existing knowledge and generalize poorly. Few-shot learning methods offer a promising way to address these problems. Because the most commonly used CNN models are poor at sequence labeling and at capturing long-range dependencies, we propose a novel model that integrates the Transformer into a prototypical network for more powerful relation-level feature extraction. The Transformer connects tokens directly, adapting to long-sequence learning without catastrophic forgetting, and gains richer semantic information by attending to several representation subspaces in parallel for each word. We evaluate our method on three tasks: in-domain, cross-domain, and cross-sentence. Our method achieves a trade-off between performance and computation, with an approximately 8% improvement over the state-of-the-art prototypical network across different settings. In addition, our experiments show that our approach is competitive among few-shot learning methods on cross-domain transfer and cross-sentence relation extraction.
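The core mechanism the abstract builds on can be sketched briefly: a prototypical network represents each relation class by the mean (the "prototype") of its support-set embeddings, and classifies a query by the nearest prototype. The sketch below is illustrative only and assumes fixed toy embeddings in place of the paper's Transformer encoder; the function names and the 2-class example are my own, not from the paper.

```python
import numpy as np

def prototypes(support, labels):
    """One prototype per relation class: the mean of that class's
    support-instance embeddings (the core of a prototypical network)."""
    classes = sorted(set(labels))
    protos = np.stack([
        np.mean([e for e, y in zip(support, labels) if y == c], axis=0)
        for c in classes
    ])
    return classes, protos

def classify(query, support, labels):
    """Assign a query embedding to the nearest prototype,
    scoring by negative squared Euclidean distance."""
    classes, protos = prototypes(support, labels)
    dists = np.sum((protos - query) ** 2, axis=1)
    return classes[int(np.argmin(dists))]

# Toy episode: 2 relation classes, 2 support instances each,
# 2-d embeddings standing in for Transformer sentence encodings.
support = np.array([[0.0, 0.0], [0.2, 0.1],    # "born_in" instances
                    [1.0, 1.0], [0.9, 1.1]])   # "works_for" instances
labels = ["born_in", "born_in", "works_for", "works_for"]
print(classify(np.array([0.1, 0.0]), support, labels))  # -> born_in
```

In the paper's setting, the toy embeddings above would instead come from the Transformer encoder, so that the prototypes capture relation-level features over long-range dependencies.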
Pages: 17