Few-shot relation classification based on the BERT model, hybrid attention and fusion networks

Cited by: 2
Authors
Li, Yibing [1 ,2 ,3 ]
Ding, Zenghui [1 ]
Ma, Zuchang [1 ]
Wu, Yichen [1 ,2 ]
Wang, Yu [1 ,2 ]
Zhang, Ruiqi [1 ,2 ]
Xie, Fei [3 ]
Ren, Xiaoye [3 ]
Affiliations
[1] Chinese Acad Sci, Inst Intelligent Machines, Inst Phys Sci, Hefei 230031, Peoples R China
[2] Univ Sci & Technol China, Sci Isl Branch, Grad Sch, Hefei 230026, Peoples R China
[3] Hefei Normal Univ, Sch Comp Sci & Technol, Hefei 230601, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Relation classification; Few-shot learning; BERT; Attention; Rapidity of convergence; Supervision
DOI
10.1007/s10489-023-04634-0
CLC Classification Number
TP18 [Artificial intelligence theory];
Discipline Codes
081104; 0812; 0835; 1405;
Abstract
Relation classification (RC) is an essential task in information extraction. Distant supervision (DS) can exploit large amounts of unlabeled data and thus alleviate the shortage of training data for the RC task, but it suffers from long-tail and noise problems. Intuitively, these problems can be addressed with few-shot learning (FSL). Our work aims to improve both the accuracy and the speed of convergence on the few-shot RC task. We believe that entity pairs play an essential role in few-shot RC. We propose a new context encoder, built on the bidirectional encoder representations from transformers (BERT) model, that fuses entity pairs and their dependency information within instances. At the same time, we design a hybrid attention mechanism that includes support instance-level and query instance-level attention. Support instance-level attention dynamically assigns a weight to each instance in the support set, compensating for prototypical networks, which weight all sentences equally. Query instance-level attention dynamically assigns weights to query instances according to their similarity with the prototype. An ablation study demonstrates the effectiveness of the proposed method. In addition, a fusion network is designed to replace the Euclidean distance used for class matching in previous works, improving the speed of convergence and making the model more suitable for industrial applications. Experimental results show that the proposed model achieves better accuracy than several other models.
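The abstract only outlines the architecture, so the following is a minimal, hypothetical PyTorch sketch (not the authors' released code) of two of the ideas it describes: support instance-level attention that weights support instances when forming class prototypes, instead of the uniform averaging of plain prototypical networks, and a small fusion network that scores a query against each prototype in place of negative Euclidean distance. All module names, dimensions, and the exact fusion features are assumptions made for illustration; query instance-level weighting and the BERT entity-pair encoder are omitted.

```python
# Hypothetical sketch of attention-weighted prototypes plus a fusion-network
# class matcher, assuming 768-dimensional sentence encodings (e.g., from BERT).
import torch
import torch.nn as nn
import torch.nn.functional as F


class AttentivePrototypeMatcher(nn.Module):
    def __init__(self, hidden_dim: int = 768):
        super().__init__()
        # Scores each support instance against the query, so prototypes become
        # weighted sums rather than the plain means of prototypical networks.
        self.support_att = nn.Linear(hidden_dim, hidden_dim, bias=False)
        # Fusion network: scores [query; prototype; |query - prototype|] with a
        # small MLP instead of computing a Euclidean distance.
        self.fusion = nn.Sequential(
            nn.Linear(3 * hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, support: torch.Tensor, query: torch.Tensor) -> torch.Tensor:
        # support: (N, K, D) encoded support instances for N classes, K shots
        # query:   (Q, D)    encoded query instances
        N, K, D = support.shape
        Q = query.size(0)

        # Support instance-level attention: weight each shot by its
        # compatibility with the query before averaging into a prototype.
        att_logits = torch.einsum("qd,nkd->qnk", self.support_att(query), support)
        att = F.softmax(att_logits, dim=-1)                       # (Q, N, K)
        prototypes = torch.einsum("qnk,nkd->qnd", att, support)   # (Q, N, D)

        # Fusion-network class matching instead of -||q - p||^2.
        q_exp = query.unsqueeze(1).expand(Q, N, D)
        fused = torch.cat([q_exp, prototypes, (q_exp - prototypes).abs()], dim=-1)
        return self.fusion(fused).squeeze(-1)                     # (Q, N) class scores


if __name__ == "__main__":
    # Toy 5-way 5-shot episode with random stand-ins for BERT features.
    matcher = AttentivePrototypeMatcher(hidden_dim=768)
    support = torch.randn(5, 5, 768)
    query = torch.randn(3, 768)
    print(matcher(support, query).shape)  # torch.Size([3, 5])
```

Because the fusion MLP is trained jointly with the encoder, its class scores can adapt faster than a fixed distance metric, which is consistent with the convergence-speed claim in the abstract.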
Pages: 21448-21464
Number of pages: 17