Leveraging Prompt and Top-K Predictions with ChatGPT Data Augmentation for Improved Relation Extraction

Cited by: 1

Authors:

Feng, Ping [1,2,3,4,5]

Wu, Hang [6]

Yang, Ziqian [6]

Wang, Yunyi [6]

Ouyang, Dantong [1]

Affiliations:

[1] Jilin Univ, Coll Comp Sci & Technol, Changchun 130012, Peoples R China

[2] Changchun Univ, Coll Comp Sci & Technol, Changchun 130022, Peoples R China

[3] Key Lab Intelligent Rehabil & Barrier Free Access, Minist Educ, Changchun 130022, Peoples R China

[4] Jilin Prov Key Lab Human Hlth State Identificat &, Changchun 130022, Peoples R China

[5] Jilin Rehabil Equipment & Technol Engn Res Ctr Dis, Changchun 130022, Peoples R China

[6] Changchun Univ, Coll Cybersecur, Changchun 130022, Peoples R China

Source:

APPLIED SCIENCES-BASEL | 2023 / Vol. 13 / Issue 23

Keywords:

relation extraction; language model; prompt information; deep learning models; NLP;

DOI:

10.3390/app132312746

Chinese Library Classification (CLC) number:

O6 [Chemistry];

Discipline classification code:

0703;

Abstract:

Relation extraction tasks aim to predict the type of relationship between two entities in a given text. However, many existing methods fail to fully utilize the semantic information and the output probability distribution of pre-trained language models, and existing data augmentation approaches for natural language processing (NLP) may introduce errors. To address these issues, we propose a method that introduces prompt information and Top-K prediction sets and uses ChatGPT for data augmentation to improve the performance of relation classification models. First, we add prompt information before each sample, encode the modified samples with the pre-trained language model RoBERTa, and use the resulting feature vectors to obtain the Top-K prediction set. We add a multi-attention mechanism to link the Top-K prediction set with the prompt information. We then reduce the possibility of introducing noise by bootstrapping ChatGPT so that it can better perform the data augmentation task and reduce subsequent unnecessary operations. Finally, we investigate the predefined relationship categories in the SemEval 2010 Task 8 dataset and the prediction results of the model, and propose an entity location prediction task designed to help the model accurately determine the relative locations of entities. Experimental results indicate that our model achieves high results on the SemEval 2010 Task 8 dataset.
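The first step described in the abstract (prepending prompt information, then keeping the K most probable relation labels) can be illustrated with a minimal sketch. This is not the authors' implementation: the prompt template, the label subset, the function names, and the logits are all hypothetical, and a plain softmax over hand-written scores stands in for the RoBERTa encoder and classifier.

```python
import math

# Hypothetical subset of relation labels, for illustration only.
RELATIONS = ["Cause-Effect", "Component-Whole", "Entity-Origin", "Other"]


def add_prompt(sentence, head, tail):
    """Prepend prompt information naming the two entities (assumed template)."""
    return f"The relation between {head} and {tail}: {sentence}"


def softmax(logits):
    """Convert raw scores to probabilities, shifting by the max for stability."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_predictions(logits, labels, k):
    """Return the k (label, probability) pairs with the highest probability."""
    probs = softmax(logits)
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Made-up classifier scores for one prompted sample (not real model output).
prompted = add_prompt("The fire was caused by the storm.", "fire", "storm")
example_logits = [2.0, 0.5, 1.0, -1.0]
top2 = top_k_predictions(example_logits, RELATIONS, k=2)
```

In this sketch `top2` holds the two candidate relations that a later stage, such as the attention mechanism mentioned in the abstract, would relate back to the prompt; how that linking is done is not specified here.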

Pages: 13
