I-WAS: A Data Augmentation Method with GPT-2 for Simile Detection

Cited by: 3
Authors
Chang, Yongzhu [1]
Zhang, Rongsheng [1]
Pu, Jiashu [1]
Affiliations
[1] NetEase Inc., Fuxi AI Lab, Hangzhou, People's Republic of China
Source
DOCUMENT ANALYSIS AND RECOGNITION - ICDAR 2023, PT III | 2023 / Vol. 14189
Keywords
GPT-2; Simile detection; Data augmentation; Iterative
DOI
10.1007/978-3-031-41682-8_17
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Simile detection is a valuable task for many natural language processing (NLP) applications, particularly in the field of literature. However, existing research on simile detection often relies on corpora that are limited in size and do not adequately represent the full range of simile forms. To address this issue, we propose a simile data augmentation method based on Word replacement And Sentence completion using the GPT-2 language model. Our iterative process, called I-WAS, is designed to improve the quality of the augmented sentences. To better evaluate the performance of our method in real-world applications, we have compiled a corpus containing a more diverse set of simile forms for experimentation. Our experimental results demonstrate the effectiveness of the proposed data augmentation method for simile detection.
Pages: 265-279
Page count: 15