Enhanced prototypical network for few-shot relation extraction

Cited by: 30
Authors
Wen, Wen [1 ]
Liu, Yongbin [1 ]
Ouyang, Chunping [1 ,3 ]
Lin, Qiang [1 ]
Chung, Tonglee [2 ]
Affiliations
[1] Univ South China, Sch Comp, Hengyang, Hunan, Peoples R China
[2] Tsinghua Univ, Dept Comp Sci & Technol, Beijing 100084, Peoples R China
[3] Hunan Prov Base Sci & Technol Innovat Cooperat, Xiangtan, Hunan, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Few-shot learning; Transformer; Relation extraction;
DOI
10.1016/j.ipm.2021.102596
Chinese Library Classification
TP [Automation technology; computer technology]
Subject Classification Code
0812
Abstract
Most existing methods for relation extraction depend heavily on large-scale annotated data; they cannot learn from existing knowledge and generalize poorly. Few-shot learning methods offer a promising way to address these problems. Because the most commonly used CNN models are poor at sequence labeling and at capturing long-range dependencies, we propose a novel model that integrates the Transformer into a prototypical network for more powerful relation-level feature extraction. The Transformer connects tokens directly, adapting to long-sequence learning without catastrophic forgetting, and gains richer semantic information by attending to several representation subspaces in parallel for each word. We evaluate our method on three tasks: in-domain, cross-domain, and cross-sentence. Our method achieves a trade-off between performance and computation, with an approximately 8% improvement over the state-of-the-art prototypical network across different settings. In addition, our experiments show that our approach is competitive among few-shot learning methods on cross-domain transfer and cross-sentence relation extraction.
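The core mechanism the abstract builds on can be sketched briefly: a prototypical network represents each relation class by the mean (the "prototype") of its support-set embeddings, and classifies a query by the nearest prototype. The sketch below is illustrative only and assumes fixed toy embeddings in place of the paper's Transformer encoder; the function names and the 2-class example are my own, not from the paper.

```python
import numpy as np

def prototypes(support, labels):
    """One prototype per relation class: the mean of that class's
    support-instance embeddings (the core of a prototypical network)."""
    classes = sorted(set(labels))
    protos = np.stack([
        np.mean([e for e, y in zip(support, labels) if y == c], axis=0)
        for c in classes
    ])
    return classes, protos

def classify(query, support, labels):
    """Assign a query embedding to the nearest prototype,
    scoring by negative squared Euclidean distance."""
    classes, protos = prototypes(support, labels)
    dists = np.sum((protos - query) ** 2, axis=1)
    return classes[int(np.argmin(dists))]

# Toy episode: 2 relation classes, 2 support instances each,
# 2-d embeddings standing in for Transformer sentence encodings.
support = np.array([[0.0, 0.0], [0.2, 0.1],    # "born_in" instances
                    [1.0, 1.0], [0.9, 1.1]])   # "works_for" instances
labels = ["born_in", "born_in", "works_for", "works_for"]
print(classify(np.array([0.1, 0.0]), support, labels))  # -> born_in
```

In the paper's setting, the toy embeddings above would instead come from the Transformer encoder, so that the prototypes capture relation-level features over long-range dependencies.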
Pages: 17