Improving Open Information Extraction with Distant Supervision Learning

Times cited: 3
Authors
Han, Jiabao [1]
Wang, Hongzhi [1]
Affiliations
[1] Harbin Inst Technol, Sch Comp Sci & Technol, Harbin 150001, Peoples R China
Keywords
Distant supervision learning; Open information extraction; Neural network; Sequence-to-sequence model;
DOI
10.1007/s11063-021-10548-0
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Open information extraction (Open IE), one of the essential applications in Natural Language Processing (NLP), has gained great attention in recent years. As a critical technology for building Knowledge Bases (KBs), it converts unstructured natural language sentences into structured representations, usually expressed as triples. Most conventional Open IE approaches either rely on manually pre-defined extraction patterns or learn patterns from labeled training examples, both of which require substantial human effort. They also involve many NLP tools, which leads to error accumulation and propagation. With the rapid development of neural networks, neural models can reduce the error propagation problem, but under supervised learning they are data-hungry. In particular, they rely on existing Open IE tools to generate training data, which causes data quality issues. In this paper, we employ a distant supervision learning approach to improve the Open IE task. We conduct extensive experiments with two popular sequence-to-sequence models (RNN and Transformer) and a large benchmark data set to demonstrate the performance of our approach.
Pages: 3287-3306
Number of pages: 20