Use of Data Augmentation Techniques in Detection of Antisocial Behavior Using Deep Learning Methods

Cited by: 7
Authors
Maslej-Kresnakova, Viera [1]
Sarnovsky, Martin [1]
Jackova, Julia [1]
Affiliations
[1] Tech Univ Kosice, Fac Elect Engn & Informat, Dept Cybernet & Artificial Intelligence, Kosice 04001, Slovakia
Source
FUTURE INTERNET | 2022, Vol. 14, No. 9
Keywords
data augmentation; EDA; deep learning; antisocial behavior; fake news detection; toxic comments
DOI
10.3390/fi14090260
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The work presented in this paper focuses on the use of data augmentation techniques in the detection of antisocial behavior. Data augmentation is a frequently used approach to overcoming a lack of data or imbalanced classes. Such techniques generate artificial data samples that increase the volume of the training set or balance the target distribution. In the antisocial behavior detection domain, we frequently face both issues: a lack of quality labeled data and class imbalance. As the majority of the data in this domain is textual, we must consider augmentation methods suitable for NLP tasks. Easy data augmentation (EDA) is a group of such methods that use simple text transformations to create new, artificial samples. Our main motivation is to explore the usability of EDA techniques on selected tasks from the antisocial behavior detection domain. We focus on the class imbalance problem and apply EDA techniques to two problems: fake news and toxic comments classification. In both cases, we train a convolutional neural network classifier and compare its performance on the original and EDA-extended datasets. EDA techniques prove to be very task-dependent, with certain limitations resulting from the data they are applied to. On the extended toxic comments dataset, the model's performance improved only marginally, with F1 gaining just 0.01 when only a subset of EDA methods was applied. In this case, EDA techniques were not well suited to texts written in more informal language. On the fake news dataset, by contrast, the improvement was more substantial, with the F1 score rising by 0.1. The improvement was largest for the minority class, where F1 rose from 0.67 to 0.86.
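To make the idea of "simple text transformations" concrete, the following is a minimal sketch of the four operations usually associated with EDA (synonym replacement, random insertion, random swap, random deletion), followed by a helper that oversamples the minority class with them. It is not the authors' implementation. The function names, the `synonyms` lookup table (a stand-in for a real thesaurus such as WordNet), the parameter values, and the binary-label balancing strategy are all assumptions made for illustration.

```python
import random


def synonym_replacement(words, synonyms, n, rng):
    """Replace up to n words that have an entry in the (hypothetical) synonym table."""
    words = list(words)
    candidates = [i for i, w in enumerate(words) if w in synonyms]
    rng.shuffle(candidates)
    for i in candidates[:n]:
        words[i] = rng.choice(synonyms[words[i]])
    return words


def random_insertion(words, synonyms, n, rng):
    """Insert n synonyms of words from the sentence at random positions."""
    words = list(words)
    pool = [w for w in words if w in synonyms]
    for _ in range(n if pool else 0):
        synonym = rng.choice(synonyms[rng.choice(pool)])
        words.insert(rng.randint(0, len(words)), synonym)
    return words


def random_swap(words, n_swaps, rng):
    """Swap two randomly chosen positions, n_swaps times."""
    words = list(words)
    if len(words) < 2:
        return words
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(words)), 2)
        words[i], words[j] = words[j], words[i]
    return words


def random_deletion(words, p, rng):
    """Drop each word with probability p, always keeping at least one word."""
    if len(words) <= 1:
        return list(words)
    kept = [w for w in words if rng.random() >= p]
    return kept if kept else [rng.choice(words)]


def balance_with_eda(texts, labels, minority, synonyms, seed=0):
    """Append augmented minority samples until both classes have equal size.

    Assumes binary labels: every label other than `minority` is the majority.
    """
    rng = random.Random(seed)
    minority_texts = [t for t, y in zip(texts, labels) if y == minority]
    deficit = (len(labels) - len(minority_texts)) - len(minority_texts)
    new_texts, new_labels = list(texts), list(labels)
    if not minority_texts:
        return new_texts, new_labels
    ops = [
        lambda w: synonym_replacement(w, synonyms, 1, rng),
        lambda w: random_insertion(w, synonyms, 1, rng),
        lambda w: random_swap(w, 1, rng),
        lambda w: random_deletion(w, 0.1, rng),
    ]
    for _ in range(max(deficit, 0)):
        words = rng.choice(minority_texts).split()
        new_texts.append(" ".join(rng.choice(ops)(words)))
        new_labels.append(minority)
    return new_texts, new_labels
```

Because every operation works on whitespace-separated tokens and a fixed synonym table, a sketch like this also hints at why such methods may struggle with informal text: misspellings, slang, and obfuscated words rarely have thesaurus entries, so the synonym-based operations fall back to doing nothing.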
Pages: 15