An unbalanced data classification model using hybrid sampling technique for fraud detection

Cited by: 0
Authors
Padmaja, T. Maruthi [1 ]
Dhulipalla, Narendra [1 ]
Krishna, P. Radha [1 ]
Bapi, Raju S. [2 ]
Laha, A. [1 ]
Affiliations
[1] IDRBT, Hyderabad, Andhra Pradesh, India
[2] Univ Hyderabad, Dept Comp & Informat Sci, Hyderabad, Andhra Pradesh, India
Keywords
fraud detection; SMOTE; VDM; hybrid sampling and data mining;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Detecting fraud is a challenging task because fraud coexists with the latest technology. A central difficulty is that the dataset is unbalanced: the non-fraudulent class heavily dominates the fraudulent class. In this work, we treat fraud detection as an unbalanced data classification problem and propose a model based on a hybrid sampling technique that combines random under-sampling with over-sampling using SMOTE. SMOTE widens the data region corresponding to the minority samples, and random under-sampling of the majority class balances the class distribution. The value difference metric (VDM) is used as the distance measure in SMOTE. We conducted experiments with k-NN, Radial Basis Function network, C4.5, and Naive Bayes classifiers at varied levels of SMOTE on an insurance fraud dataset. To evaluate the learned classifiers, we chose the fraud catching rate and the non-fraud catching rate, in addition to the overall accuracy of the classifier, as performance measures. Results indicate that our approach achieves high prediction rates on both the fraud and non-fraud classes.
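The hybrid sampling idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes all features are nominal, that synthetic minority records are formed by a majority vote over the VDM-nearest minority neighbours (in the style of SMOTE for nominal data), and that the majority class is under-sampled down to the size of the enlarged minority class. The function names and the `smote_pct`, `k`, `q`, and `seed` parameters are hypothetical, and `catching_rates` assumes both classes occur in `y_true`.

```python
import random
from collections import Counter, defaultdict


def vdm_tables(X, y):
    """For each feature, map every value to its vector of P(class | value)."""
    classes = sorted(set(y))
    tables = []
    for j in range(len(X[0])):
        counts = defaultdict(Counter)
        for row, label in zip(X, y):
            counts[row[j]][label] += 1
        tables.append({v: [c[k] / sum(c.values()) for k in classes]
                       for v, c in counts.items()})
    return tables


def vdm_distance(a, b, tables, q=2):
    """Value difference metric between two nominal records (assumed exponent q)."""
    return sum(
        sum(abs(pa - pb) ** q for pa, pb in zip(tables[j][a[j]], tables[j][b[j]]))
        for j in range(len(a)))


def smote_nominal(minority, tables, n_new, k, rng):
    """Create n_new synthetic records by majority vote over VDM neighbours."""
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        others = [m for idx, m in enumerate(minority) if idx != i]
        others.sort(key=lambda m: vdm_distance(base, m, tables))
        group = [base] + others[:k]
        synthetic.append(tuple(
            Counter(rec[j] for rec in group).most_common(1)[0][0]
            for j in range(len(base))))
    return synthetic


def hybrid_sample(X, y, minority_label, smote_pct=200, k=3, seed=0):
    """Over-sample the minority class, then randomly under-sample the majority."""
    rng = random.Random(seed)
    tables = vdm_tables(X, y)
    minority = [x for x, l in zip(X, y) if l == minority_label]
    majority = [(x, l) for x, l in zip(X, y) if l != minority_label]
    n_new = len(minority) * smote_pct // 100
    minority_all = minority + smote_nominal(minority, tables, n_new, k, rng)
    # Assumed balancing rule: keep as many majority records as minority ones.
    kept = rng.sample(majority, min(len(majority), len(minority_all)))
    X_out = minority_all + [x for x, _ in kept]
    y_out = [minority_label] * len(minority_all) + [l for _, l in kept]
    return X_out, y_out


def catching_rates(y_true, y_pred, fraud_label):
    """Return (fraud catching rate, non-fraud catching rate, overall accuracy)."""
    pairs = list(zip(y_true, y_pred))
    fraud = [p == t for t, p in pairs if t == fraud_label]
    legit = [p == t for t, p in pairs if t != fraud_label]
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    return sum(fraud) / len(fraud), sum(legit) / len(legit), accuracy
```

Under these assumptions, a call such as `hybrid_sample(X, y, minority_label=1, smote_pct=200)` triples the minority class and trims the majority class to match; the resampled data would then be passed to any of the classifiers named in the abstract. The two catching rates correspond to per-class recall, which makes them less sensitive to class imbalance than overall accuracy.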
Pages: 341 / +
Page count: 2
Related papers
50 in total
  • [11] SYNTHETIC MINORITY OVER-SAMPLING TECHNIQUE BASED ROTATION FOREST FOR THE CLASSIFICATION OF UNBALANCED HYPERSPECTRAL DATA
    Feng, Wei
    Huang, Wenjiang
    Ye, Huichun
    Zhao, Longlong
    IGARSS 2018 - 2018 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM, 2018, : 2651 - 2654
  • [12] Online In-Auction Fraud Detection Using Online Hybrid Model
    Gupta, Priyanka
    Mundra, Ankit
    2015 INTERNATIONAL CONFERENCE ON COMPUTING, COMMUNICATION & AUTOMATION (ICCCA), 2015, : 901 - 907
  • [13] A Financial Statement Fraud Detection Model Based on Hybrid Data Mining Methods
    Yao, Jianrong
    Zhang, Jie
    Wang, Lu
    2018 INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND BIG DATA (ICAIBD), 2018, : 57 - 61
  • [14] A Hybrid Model for Fraud Detection on Purchase Orders
    Moreno Oliverio, William Ferreira
    Silva, Allan Barcelos
    Rigo, Sandro Jose
    Bezerra da Costa, Rodolpho Lopes
    INTELLIGENT DATA ENGINEERING AND AUTOMATED LEARNING - IDEAL 2019, PT I, 2019, 11871 : 110 - 120
  • [15] Facial Fraud Discrimination Using Detection and Classification
    Choi, Inho
    Kim, Daijin
    ADVANCES IN VISUAL COMPUTING, PT III, 2010, 6455 : 199 - 208
  • [16] A Comparison of Data Sampling Techniques for Credit Card Fraud Detection
    Muaz, Abdulla
    Jayabalan, Manoj
    Thiruchelvam, Vinesh
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2020, 11 (06) : 477 - 485
  • [17] Unbalanced data classification based on over-sampling and integrated learning
    Zhang, Yongjun
    Jian, Xiaowen
    2021 ASIA-PACIFIC CONFERENCE ON COMMUNICATIONS TECHNOLOGY AND COMPUTER SCIENCE (ACCTCS 2021), 2021, : 332 - 337
  • [18] Logistic Regression Learning Model for Handling Concept Drift with Unbalanced Data in Credit Card Fraud Detection System
    Kulkarni, Pallavi
    Ade, Roshani
    PROCEEDINGS OF THE SECOND INTERNATIONAL CONFERENCE ON COMPUTER AND COMMUNICATION TECHNOLOGIES, IC3T 2015, VOL 2, 2016, 380 : 681 - 689
  • [19] A Hybrid Feature Selection Algorithm For Classification Unbalanced Data Processsing
    Zhang, Xue
    Shi, Zhiguo
    Liu, Xuan
    Li, Xueni
    2018 IEEE INTERNATIONAL CONFERENCE ON SMART INTERNET OF THINGS (SMARTIOT 2018), 2018, : 269 - 275
  • [20] Data Sampling Approaches with Severely Imbalanced Big Data for Medicare Fraud Detection
    Bauder, Richard A.
    Khoshgoftaar, Taghi M.
    Hasanin, Tawfiq
    2018 IEEE 30TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI), 2018, : 137 - 142