Data Augmentation Methods for Enhancing Robustness in Text Classification Tasks

Cited by: 3
Authors
Tang, Huidong [1]
Kamei, Sayaka [1]
Morimoto, Yasuhiko [1]
Affiliations
[1] Hiroshima Univ, Grad Sch Adv Sci & Engn, Kagamiyama 1-7-1, Higashihiroshima 7398521, Japan
Keywords
artificial intelligence; natural language processing; text classification; data augmentation; robustness improvement
DOI
10.3390/a16010059
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Text classification is widely studied in natural language processing (NLP). Deep learning models, including large pre-trained models such as BERT and DistilBERT, have achieved impressive results on text classification tasks. However, the robustness of these models against adversarial attacks remains a concern. To address it, we propose three data augmentation methods to improve the robustness of such pre-trained models. We evaluated our methods on four text classification datasets by fine-tuning DistilBERT on the augmented datasets and then exposing the resulting models to adversarial attacks. In addition to enhancing robustness, our proposed methods can improve accuracy and F1-score on three of the datasets. We also conducted comparison experiments with two existing data augmentation methods. One of our proposed methods shows a performance improvement similar to that of the existing methods, while all three show a superior robustness improvement.
Pages: 21