Leveraging Prompt and Top-K Predictions with ChatGPT Data Augmentation for Improved Relation Extraction

Cited by: 1
Authors
Feng, Ping [1, 2, 3, 4, 5]
Wu, Hang [6]
Yang, Ziqian [6]
Wang, Yunyi [6]
Ouyang, Dantong [1]
Affiliations
[1] Jilin Univ, Coll Comp Sci & Technol, Changchun 130012, Peoples R China
[2] Changchun Univ, Coll Comp Sci & Technol, Changchun 130022, Peoples R China
[3] Key Lab Intelligent Rehabil & Barrier Free Access, Minist Educ, Changchun 130022, Peoples R China
[4] Jilin Prov Key Lab Human Hlth State Identificat &, Changchun 130022, Peoples R China
[5] Jilin Rehabil Equipment & Technol Engn Res Ctr Dis, Changchun 130022, Peoples R China
[6] Changchun Univ, Coll Cybersecur, Changchun 130022, Peoples R China
Source
APPLIED SCIENCES-BASEL | 2023 / Vol. 13 / Issue 23
Keywords
relation extraction; language model; prompt information; deep learning models; NLP
DOI
10.3390/app132312746
Chinese Library Classification number
O6 [Chemistry]
Discipline classification code
0703
Abstract
Relation extraction tasks aim to predict the type of relationship between two entities in a given text. However, many existing methods fail to fully utilize the semantic information and the output probability distribution of pre-trained language models, and existing data augmentation approaches for natural language processing (NLP) may introduce errors. To address these issues, we propose a method that introduces prompt information and Top-K prediction sets and uses ChatGPT for data augmentation to improve the performance of relation classification models. First, we add prompt information before each sample, encode the modified samples with the pre-trained language model RoBERTa, and use the resulting feature vectors to obtain the Top-K prediction set. We add a multi-attention mechanism to link the Top-K prediction set with the prompt information. We then reduce the possibility of introducing noise by bootstrapping ChatGPT so that it can better perform the data augmentation task and reduce subsequent unnecessary operations. Finally, we investigate the predefined relation categories in the SemEval 2010 Task 8 dataset and the prediction results of the model, and propose an entity location prediction task designed to help the model accurately determine the relative locations of the entities. Experimental results indicate that our model achieves strong performance on the SemEval 2010 Task 8 dataset.
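The first step described in the abstract (prepend prompt information, score the relation labels, keep a Top-K prediction set) can be illustrated with a minimal sketch. Everything below is an assumption for illustration only: the prompt template, the function names `add_prompt` and `top_k_predictions`, and the logits are hypothetical, and the scores stand in for what an encoder such as RoBERTa plus a classifier head would produce. The label names are relation types from SemEval 2010 Task 8.

```python
import math


def add_prompt(sentence, head, tail):
    # Hypothetical template: the abstract does not give the actual prompt wording.
    return f"The relation between {head} and {tail} is <mask>. {sentence}"


def softmax(logits):
    # Numerically stable softmax over a list of raw scores.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_predictions(logits, labels, k=3):
    # Rank labels by probability and keep the k most likely ones.
    probs = softmax(logits)
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# A subset of SemEval 2010 Task 8 relation labels.
labels = ["Cause-Effect", "Component-Whole", "Entity-Destination", "Other"]
# Made-up scores standing in for a classifier's output on one sample.
logits = [2.0, 0.5, 1.0, -1.0]

prompted = add_prompt("The fire was caused by a short circuit.", "fire", "short circuit")
top2 = top_k_predictions(logits, labels, k=2)
print(prompted)
print([label for label, _ in top2])
```

In this sketch the Top-K set is just the k highest-probability labels; how the paper links that set to the prompt through its attention mechanism is not specified in the abstract and is not modeled here.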
Pages: 13