A large-scale dataset for Korean document-level relation extraction from encyclopedia texts

Cited by: 0
Authors
Son, Suhyune [1]
Lim, Jungwoo [1]
Koo, Seonmin [1]
Kim, Jinsung [1]
Kim, Younghoon [2]
Lim, Youngsik [2]
Hyun, Dongseok [2]
Lim, Heuiseok [1]
Affiliations
[1] Korea Univ, Comp Sci & Engn, 1 5-ka,Anam Dong, Seoul 02841, South Korea
[2] NAVER, 5 Jeongjail ro,Buljeong ro, Seongnam 13561, South Korea
Funding
National Research Foundation of Singapore;
Keywords
Natural Language Processing; Information Extraction; Document-level Relation Extraction; Korean Relation Extraction; ENTITY;
DOI
10.1007/s10489-024-05605-9
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Document-level relation extraction (RE) aims to predict the relational facts between two given entities in a document. Unlike the widespread research on document-level RE in English, Korean document-level RE research is still at a very early stage due to the absence of a dataset. To accelerate such studies, we present the TREK (Toward Document-Level Relation Extraction in Korean) dataset, constructed from Korean encyclopedia documents written by domain experts. We provide detailed statistical analyses of our large-scale dataset, and human evaluation results suggest the assured quality of TREK. We also introduce a document-level RE model that considers named entity types while taking the properties of the Korean language into account. In the experiments, we demonstrate that our proposed model outperforms the baselines, and we conduct a qualitative analysis.
Pages: 8681-8701
Number of pages: 21
Related papers
50 in total
  • [41] Refining ChatGPT for Document-Level Relation Extraction: A Multi-dimensional Prompting Approach
    Zhu, Weiran
    Wang, Xinzhi
    Chen, Xue
    Luo, Xiangfeng
    ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, PT III, ICIC 2024, 2024, 14877 : 190 - 201
  • [42] A large-scale Chinese patent dataset for information extraction
    Zheng, Qian
    Guo, Kefu
    Xu, Lin
    SYSTEMS SCIENCE & CONTROL ENGINEERING, 2024, 12 (01)
  • [43] Towards Large-Scale Unsupervised Relation Extraction from the Web
    Min, Bonan
    Shi, Shuming
    Grishman, Ralph
    Lin, Chin-Yew
    INTERNATIONAL JOURNAL ON SEMANTIC WEB AND INFORMATION SYSTEMS, 2012, 8 (03) : 1 - 23
  • [44] Infusing Dependency Syntax Information into a Transformer Model for Document-Level Relation Extraction from Biomedical Literature
    Yang, Ming
    Zhang, Yijia
    Liu, Da
    Du, Wei
    Di, Yide
    Lin, Hongfei
    HEALTH INFORMATION PROCESSING, CHIP 2022, 2023, 1772 : 37 - 52
  • [45] Multi-View Cooperative Learning with Invariant Rationale for Document-Level Relation Extraction
    Lin, Rui
    Fan, Jing
    He, Yinglong
    Yang, Yehui
    Li, Jia
    Guo, Cunhan
    COGNITIVE COMPUTATION, 2024, 16 (06) : 3505 - 3517
  • [46] SaGCN: Structure-Aware Graph Convolution Network for Document-Level Relation Extraction
    Yang, Shuangji
    Zhang, Taolin
    Su, Danning
    Hu, Nan
    Nong, Wei
    He, Xiaofeng
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PAKDD 2021, PT III, 2021, 12714 : 377 - 389
  • [47] Enhanced graph convolutional network based on node importance for document-level relation extraction
    Sun, Qi
    Zhang, Kun
    Huang, Kun
    Li, Xun
    Zhang, Ting
    Xu, Tiancheng
    NEURAL COMPUTING & APPLICATIONS, 2022, 34 (18) : 15429 - 15439
  • [48] Graph neural networks with selective attention and path reasoning for document-level relation extraction
    Hang, Tingting
    Feng, Jun
    Wang, Yunfeng
    Yan, Le
    APPLIED INTELLIGENCE, 2024, 54 (07) : 5353 - 5372
  • [49] Feature-Enhanced Document-Level Relation Extraction in Threat Intelligence with Knowledge Distillation
    Li, Yongfei
    Guo, Yuanbo
    Fang, Chen
    Hu, Yongjin
    Liu, Yingze
    Chen, Qingli
    ELECTRONICS, 2022, 11 (22)
  • [50] An adaptive confidence-based data revision framework for Document-level Relation Extraction
    Jiang, Chao
    Liao, Jinzhi
    Zhao, Xiang
    Zeng, Daojian
    Dai, Jianhua
    INFORMATION PROCESSING & MANAGEMENT, 2025, 62 (01)