An unbalanced data classification model using hybrid sampling technique for fraud detection

Cited by: 0
Authors
Padmaja, T. Maruthi [1 ]
Dhulipalla, Narendra [1 ]
Krishna, P. Radha [1 ]
Bapi, Raju S. [2 ]
Laha, A. [1 ]
Affiliations
[1] IDRBT, Hyderabad, Andhra Pradesh, India
[2] Univ Hyderabad, Dept Comp & Informat Sci, Hyderabad, Andhra Pradesh, India
Keywords
fraud detection; SMOTE; VDM; hybrid sampling and data mining;
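DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract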
Detecting fraud is a challenging task because fraud coexists with the latest technology. A central difficulty is that the dataset is unbalanced: the non-fraudulent class heavily dominates the fraudulent class. In this work, we treat fraud detection as an unbalanced data classification problem and propose a model based on a hybrid sampling technique that combines random under-sampling with over-sampling using SMOTE. Here, SMOTE is used to widen the data region corresponding to the minority samples, and random under-sampling of the majority class is used to balance the class distribution. The value difference metric (VDM) is used as the distance measure when applying SMOTE. We conducted experiments on an insurance fraud dataset with the classifiers k-NN, Radial Basis Function networks, C4.5, and Naive Bayes, at varied levels of SMOTE. To evaluate the learned classifiers, we chose the fraud catching rate and the non-fraud catching rate, in addition to the overall accuracy of the classifier, as performance measures. Results indicate that our approach achieves high prediction rates on both the fraud and non-fraud classes.
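The hybrid sampling idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the algorithmic details, so everything below is an assumption made for illustration. The sketch assumes purely nominal attributes, generates synthetic minority rows by a per-attribute majority vote over a seed row and its k nearest minority neighbours under VDM (in the spirit of the nominal variant of SMOTE), under-samples the majority class down to the enlarged minority size, and reads "fraud catching rate" and "non-fraud catching rate" as the per-class recall. All function names and parameters (`vdm_tables`, `vdm`, `smote_nominal`, `hybrid_sample`, `catching_rates`, `smote_pct`, `q`) are hypothetical.

```python
import random
from collections import Counter, defaultdict


def vdm_tables(X, y):
    """For each attribute, map every value to [P(class | value) for each class]."""
    classes = sorted(set(y))
    tables = []
    for a in range(len(X[0])):
        counts = defaultdict(Counter)
        for row, label in zip(X, y):
            counts[row[a]][label] += 1
        tables.append({value: [c[k] / sum(c.values()) for k in classes]
                       for value, c in counts.items()})
    return tables


def vdm(u, v, tables, q=2):
    """Value difference metric: sum over attributes and classes of |P(c|u_a) - P(c|v_a)|^q."""
    return sum(abs(p - r) ** q
               for a, table in enumerate(tables)
               for p, r in zip(table[u[a]], table[v[a]]))


def smote_nominal(minority, tables, n_new, k, rng):
    """Assumed nominal SMOTE step: each synthetic attribute is the most common
    value among a random seed row and its k nearest minority neighbours (VDM)."""
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        seed_row = minority[i]
        others = [m for j, m in enumerate(minority) if j != i]
        others.sort(key=lambda m: vdm(seed_row, m, tables))
        group = [seed_row] + others[:k]
        synthetic.append(tuple(
            Counter(row[a] for row in group).most_common(1)[0][0]
            for a in range(len(seed_row))))
    return synthetic


def hybrid_sample(X, y, minority_label, smote_pct=100, k=3, seed=0):
    """Over-sample the minority class by smote_pct percent, then randomly
    under-sample the majority class to the enlarged minority size."""
    rng = random.Random(seed)
    tables = vdm_tables(X, y)
    minority = [x for x, label in zip(X, y) if label == minority_label]
    majority = [(x, label) for x, label in zip(X, y) if label != minority_label]
    n_new = len(minority) * smote_pct // 100
    minority_all = minority + smote_nominal(minority, tables, n_new, k, rng)
    kept = rng.sample(majority, min(len(majority), len(minority_all)))
    X_out = minority_all + [x for x, _ in kept]
    y_out = [minority_label] * len(minority_all) + [label for _, label in kept]
    return X_out, y_out


def catching_rates(y_true, y_pred, fraud_label):
    """Return (fraud catching rate, non-fraud catching rate, overall accuracy),
    assuming the catching rates are the recall on each class."""
    pairs = list(zip(y_true, y_pred))
    fraud = [p == t for t, p in pairs if t == fraud_label]
    legit = [p == t for t, p in pairs if t != fraud_label]
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    return sum(fraud) / len(fraud), sum(legit) / len(legit), accuracy
```

Calling `hybrid_sample(X, y, "fraud", smote_pct=200)` on a list of attribute tuples `X` with labels `y` would triple the number of fraud rows before under-sampling; varying `smote_pct` corresponds loosely to the "varied levels of SMOTE" mentioned in the abstract. Balancing the classes exactly is a design choice of this sketch, not something the abstract specifies.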
Pages: 341 / +
Page count: 2