Research on Data Augmentation Techniques for Text Classification Based on Antonym Replacement and Random Swapping

Citations: 0
Authors
Wang, Shaoyan [1 ]
Xiang, Yu [1 ]
Affiliation
[1] Yunnan Normal Univ, Kunming 650000, Peoples R China
Source
PROCEEDINGS OF INTERNATIONAL CONFERENCE ON MODELING, NATURAL LANGUAGE PROCESSING AND MACHINE LEARNING, CMNM 2024 | 2024
Keywords
Data augmentation; Antonym substitution; Random exchange;
DOI
10.1145/3677779.3677796
Chinese Library Classification
TP18 [Theory of Artificial Intelligence]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Traditional simple data augmentation techniques have proven effective in enhancing model performance. Among these techniques, researchers have explored synonym replacement and random position swapping. However, antonym replacement has received limited attention, and work on random swapping has mostly focused on exchanging the positions of two words. An additional important challenge in data augmentation is deciding which data should be augmented. In this paper, we propose two data augmentation techniques: antonym replacement applied to data at a moderate difficulty level, and random position swapping based on specific positions and proportions. We investigate the impact of these techniques on the performance of text classification models. Specifically, for samples augmented through antonym replacement, we propose using similarity and predictive models to assign labels. For random position swapping, we primarily explore the swapping of word positions within sentences and different swapping methods. Through these two techniques, we expand our limited text data and achieve improved performance on classification tasks.
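The two techniques described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the antonym lexicon, the default swap proportion, and the function names are all assumptions, and the paper's similarity/predictive relabeling step for antonym-augmented samples is omitted.

```python
import random

# Hypothetical antonym lexicon; the paper's actual antonym resource is not specified here.
ANTONYMS = {"good": "bad", "happy": "sad", "fast": "slow", "increase": "decrease"}

def antonym_replace(tokens, antonyms=ANTONYMS):
    """Replace every token that has a known antonym.
    In the paper, the resulting samples are relabeled using similarity
    and predictive models; here we only produce the augmented tokens."""
    return [antonyms.get(t, t) for t in tokens]

def random_swap(tokens, proportion=0.1, rng=random):
    """Swap word positions within a sentence.
    Picks round(proportion * len(tokens)) index pairs (at least one) and
    exchanges them; the proportion value is an illustrative default."""
    tokens = list(tokens)
    if len(tokens) < 2:
        return tokens
    n_swaps = max(1, round(proportion * len(tokens)))
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(tokens)), 2)
        tokens[i], tokens[j] = tokens[j], tokens[i]
    return tokens
```

For example, `antonym_replace(["a", "good", "day"])` yields `["a", "bad", "day"]`, while `random_swap` leaves the bag of words unchanged and only perturbs word order.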
Pages: 103-108 (6 pages)