Research on Data Augmentation Techniques for Text Classification Based on Antonym Replacement and Random Swapping

Citations: 0
Authors
Wang, Shaoyan [1 ]
Xiang, Yu [1 ]
Affiliation
[1] Yunnan Normal Univ, Kunming 650000, Peoples R China
Source
PROCEEDINGS OF INTERNATIONAL CONFERENCE ON MODELING, NATURAL LANGUAGE PROCESSING AND MACHINE LEARNING, CMNM 2024 | 2024
Keywords
Data augmentation; Antonym substitution; Random exchange;
DOI
10.1145/3677779.3677796
Chinese Library Classification
TP18 [Theory of Artificial Intelligence]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Traditional simple data augmentation techniques have proven effective in enhancing model performance. Among these techniques, researchers have explored synonym replacement and random position swapping. However, antonym replacement has received limited attention, and work on random swapping has mostly focused on exchanging the positions of two words. An additional important challenge in data augmentation is deciding which data should be augmented. In this paper, we propose two data augmentation techniques: antonym replacement applied to data at a moderate difficulty level, and random position swapping based on specific positions and proportions. We investigate the impact of these techniques on the performance of text classification models. Specifically, for samples augmented through antonym replacement, we propose using similarity and predictive models to assign labels. For random position swapping, we primarily explore the swapping of word positions within sentences and different swapping methods. Through these two techniques, we expand our limited text data and achieve improved performance on classification tasks.
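The two techniques described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the antonym lexicon, the default swap proportion, and the function names are all assumptions, and the paper's similarity/predictive relabeling step for antonym-augmented samples is omitted.

```python
import random

# Hypothetical antonym lexicon; the paper's actual antonym resource is not specified here.
ANTONYMS = {"good": "bad", "happy": "sad", "fast": "slow", "increase": "decrease"}

def antonym_replace(tokens, antonyms=ANTONYMS):
    """Replace every token that has a known antonym.
    In the paper, the resulting samples are relabeled using similarity
    and predictive models; here we only produce the augmented tokens."""
    return [antonyms.get(t, t) for t in tokens]

def random_swap(tokens, proportion=0.1, rng=random):
    """Swap word positions within a sentence.
    Picks round(proportion * len(tokens)) index pairs (at least one) and
    exchanges them; the proportion value is an illustrative default."""
    tokens = list(tokens)
    if len(tokens) < 2:
        return tokens
    n_swaps = max(1, round(proportion * len(tokens)))
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(tokens)), 2)
        tokens[i], tokens[j] = tokens[j], tokens[i]
    return tokens
```

For example, `antonym_replace(["a", "good", "day"])` yields `["a", "bad", "day"]`, while `random_swap` leaves the bag of words unchanged and only perturbs word order.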
Pages: 103-108 (6 pages)