Few-shot relation classification based on the BERT model, hybrid attention and fusion networks

Cited by: 2
Authors
Li, Yibing [1 ,2 ,3 ]
Ding, Zenghui [1 ]
Ma, Zuchang [1 ]
Wu, Yichen [1 ,2 ]
Wang, Yu [1 ,2 ]
Zhang, Ruiqi [1 ,2 ]
Xie, Fei [3 ]
Ren, Xiaoye [3 ]
Affiliations
[1] Chinese Acad Sci, Inst Intelligent Machines, Inst Phys Sci, Hefei 230031, Peoples R China
[2] Univ Sci & Technol China, Sci Isl Branch, Grad Sch, Hefei 230026, Peoples R China
[3] Hefei Normal Univ, Sch Comp Sci & Technol, Hefei 230601, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Relation classification; Few-shot learning; BERT; Attention; Rapidity of convergence; SUPERVISION;
DOI
10.1007/s10489-023-04634-0
CLC Classification Number
TP18 [Artificial intelligence theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Relation classification (RC) is an essential task in information extraction. The distant supervision (DS) method can leverage large amounts of unlabeled data to alleviate the shortage of training data for RC. However, DS suffers from long-tail and noise problems. Intuitively, these problems can be addressed with few-shot learning (FSL). Our work aims to improve both the accuracy and the speed of convergence on the few-shot RC task. We believe that entity pairs play an essential role in few-shot RC. We propose a new context encoder, built on the bidirectional encoder representations from transformers (BERT) model, that fuses entity pairs and their dependency information into instance representations. At the same time, we design a hybrid attention mechanism that includes support instance-level and query instance-level attention. Support instance-level attention dynamically assigns a weight to each instance in the support set, compensating for a shortcoming of prototypical networks, which weight all sentences equally. Query instance-level attention dynamically assigns weights to query instances according to their similarity to the prototype. An ablation study demonstrates the effectiveness of the proposed method. In addition, a fusion network is designed to replace the Euclidean distance matching used in previous work when classes are matched, improving the speed of convergence and making our model more suitable for industrial applications. Experimental results show that the proposed model achieves higher accuracy than several competing models.
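The core idea behind support instance-level attention in the abstract can be illustrated with a minimal sketch: instead of the uniform mean over support embeddings used by vanilla prototypical networks, each support instance is weighted by its similarity to the query before the prototype is formed. This is an illustrative toy, not the paper's actual architecture; the dot-product scoring, embedding dimensions, and function names here are assumptions for demonstration only.

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over a 1-D score vector
    e = np.exp(x - x.max())
    return e / e.sum()

def attentive_prototype(support, query):
    """Weight each support embedding by its (assumed dot-product)
    similarity to the query, rather than averaging uniformly."""
    scores = support @ query        # (K,) similarity of each support instance
    weights = softmax(scores)       # attention weights summing to 1
    return weights @ support        # weighted prototype, shape (d,)

# Toy example: 3 support embeddings of dimension 4; the third is an outlier.
support = np.array([[1.0, 0.0, 0.0, 0.0],
                    [0.9, 0.1, 0.0, 0.0],
                    [0.0, 0.0, 1.0, 0.0]])
query = np.array([1.0, 0.0, 0.0, 0.0])

proto_uniform = support.mean(axis=0)          # vanilla prototypical network
proto_attn = attentive_prototype(support, query)
```

The attentive prototype leans toward the support instances that resemble the query, downweighting the outlier that a uniform mean would treat as equally informative.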
Pages: 21448-21464
Number of pages: 17