Image and Text: Fighting the Same Battle? Super-resolution Learning for Imbalanced Text Classification

Cited by: 0
Authors
Meunier, Romain [1]
Benamara, Farah [1, 2]
Moriceau, Veronique [1]
Stolf, Patricia [1]
Affiliations
[1] Univ Toulouse, Toulouse INP, CNRS, IRIT, UT3, Toulouse, France
[2] CNRS NUS ASTAR, IPAL, Singapore, Singapore
Source
Findings of the Association for Computational Linguistics (EMNLP 2023) | 2023
Funding
National Research Foundation, Singapore
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we propose SRL4NLP, a new approach to data augmentation that draws an analogy between image and text processing: super-resolution learning. This method uses high-resolution images to overcome the problems posed by low-resolution images. While the technique is commonly used in image processing when images have a low resolution or are too noisy, it has never been used in NLP. We therefore propose the first adaptation of this method to text classification and evaluate its effectiveness on urgency detection from tweets posted in crisis situations, a very challenging task where messages are scarce and highly imbalanced. We show that this strategy is effective when compared to competitive state-of-the-art data augmentation techniques on several benchmark datasets in two languages.
Pages: 10707-10720
Page count: 14