Understanding More Knowledge Makes the Transformer Perform Better in Document-level Relation Extraction

Cited by: 0
Authors
Chen, Haotian [1 ]
Chen, Yijiang [1 ]
Zhou, Xiangdong [1 ]
Affiliations
[1] Fudan Univ, Sch Comp Sci, Shanghai, Peoples R China
Source
ASIAN CONFERENCE ON MACHINE LEARNING, VOL 222 | 2023 / Vol. 222
Keywords
Document-level relation extraction; graph-based method; weighted multi-channel Transformer
DOI
Not available
Chinese Library Classification
TP18 [Theory of Artificial Intelligence]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Relation extraction plays a vital role in knowledge graph construction. In contrast to traditional relation extraction over a single sentence, extracting relations from multiple sentences as a whole harvests more valuable and richer knowledge. Recently, Transformer-based pre-trained language models (TPLMs) have been widely adopted to tackle document-level relation extraction (DocRE). Graph-based methods, which acquire knowledge between entities to form entity-level relation graphs, have facilitated the rapid development of DocRE by infusing their proposed models with this knowledge. However, beyond entity-level knowledge, we discover many other kinds of knowledge that can help humans extract relations. It remains unclear whether and in which way they can be adopted to improve the performance of the Transformer, which limits the maximum performance gain of Transformer-based methods. In this paper, we propose a novel weighted multi-channel Transformer (WMCT) to infuse unlimited kinds of knowledge into the vanilla Transformer. Based on WMCT, we also explore five kinds of knowledge to enhance both its reasoning ability and expressive power. Our extensive experimental results demonstrate that: (1) more knowledge makes the Transformer perform better and (2) more informative knowledge leads to larger performance gains. We appeal to future Transformer-based work to explore more informative knowledge to further improve the performance of the Transformer.
Pages: 16
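
The abstract describes a weighted multi-channel Transformer that injects several kinds of knowledge into attention. The paper's own WMCT implementation is not available in this record; the snippet below is only a minimal, hypothetical PyTorch sketch of one way knowledge-weighted multi-channel attention could be wired. All names (MultiChannelAttention, knowledge_masks, channel_weights) and the additive-bias combination scheme are assumptions for illustration, not the authors' code.

# Hypothetical sketch: self-attention biased by several "knowledge channels"
# (e.g., coreference links, entity co-occurrence), each with a learned weight.
# This is NOT the authors' WMCT; shapes and combination scheme are assumptions.
import torch
import torch.nn as nn

class MultiChannelAttention(nn.Module):
    def __init__(self, hidden_dim: int, num_channels: int):
        super().__init__()
        self.q = nn.Linear(hidden_dim, hidden_dim)
        self.k = nn.Linear(hidden_dim, hidden_dim)
        self.v = nn.Linear(hidden_dim, hidden_dim)
        # One learnable weight per knowledge channel.
        self.channel_weights = nn.Parameter(torch.zeros(num_channels))
        self.scale = hidden_dim ** -0.5

    def forward(self, x: torch.Tensor, knowledge_masks: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, hidden_dim)
        # knowledge_masks: (num_channels, batch, seq_len, seq_len), 1.0 where a
        # knowledge edge (e.g., coreferent mentions) connects two tokens.
        scores = torch.matmul(self.q(x), self.k(x).transpose(-1, -2)) * self.scale
        # Softmax-normalize the channel weights and add the weighted masks
        # as an additive attention bias.
        w = torch.softmax(self.channel_weights, dim=0)              # (num_channels,)
        bias = (w.view(-1, 1, 1, 1) * knowledge_masks).sum(dim=0)   # (batch, L, L)
        attn = torch.softmax(scores + bias, dim=-1)
        return torch.matmul(attn, self.v(x))

# Toy usage: 2 knowledge channels, batch of 1, sequence length 4.
if __name__ == "__main__":
    layer = MultiChannelAttention(hidden_dim=16, num_channels=2)
    tokens = torch.randn(1, 4, 16)
    masks = torch.randint(0, 2, (2, 1, 4, 4)).float()
    print(layer(tokens, masks).shape)  # torch.Size([1, 4, 16])

Under this (assumed) design, each knowledge channel contributes an additive attention bias scaled by a softmax-normalized learnable weight, so adding further channels, i.e., more kinds of knowledge, leaves the rest of the layer unchanged.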