Optimizing Machine Learning Data Pre-Processing for Financial Fraud Detection

Cited by: 0
Authors
Bower, Matthew [1]
Godasu, Rajesh [1]
Nyakundi, Nicholas [1]
Reynolds, Shawn [1]
Affiliations
[1] Univ North Dakota, Sch Elect Engn & Comp Sci, Grand Forks, ND 58202 USA
Source
2024 IEEE INTERNATIONAL CONFERENCE ON ELECTRO INFORMATION TECHNOLOGY, EIT 2024 | 2024
Keywords
Machine Learning; Financial Fraud Detection; Preprocessing; Data Balancing; Resampling; SMOTE; ADASYN; RandomOverSampler; K-Nearest Neighbours (KNN); Support Vector Machine (SVM); Random Forest (RF)
DOI
10.1109/eIT60633.2024.10609910
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Financial fraud is a costly problem that can be difficult to detect, and machine learning (ML) algorithms are frequently applied to it. Most financial fraud datasets are imbalanced, with few fraudulent and many non-fraudulent transactions. Our research focuses on how data-balancing techniques affect the accuracy of three common ML models when applied to financial fraud detection. We experimented with Resample, SMOTE, ADASYN, and RandomOverSampler for data balancing, then applied K-Nearest Neighbours (KNN), Support Vector Machine (SVM), and Random Forest (RF) algorithms to see how well they detect financial fraud after balancing.
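As a rough illustration of the data-balancing step described in the abstract, the sketch below implements random oversampling and a simplified SMOTE-style interpolation in plain Python. It is not the authors' code: the function names (`random_oversample`, `smote_like`), the toy feature values, and the choice of `k` are all assumptions made for illustration. In practice one would more likely use the `SMOTE`, `ADASYN`, and `RandomOverSampler` classes from the imbalanced-learn library before training KNN, SVM, or RF classifiers.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate minority-class rows at random until every class matches the largest one."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        rows = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - n):
            X_out.append(list(rng.choice(rows)))
            y_out.append(label)
    return X_out, y_out


def smote_like(X, y, minority, k=3, seed=0):
    """Simplified SMOTE-style oversampling (illustrative, not a full SMOTE implementation):
    interpolate between a minority row and one of its k nearest minority neighbours
    until the minority class matches the largest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    rows = [x for x, lab in zip(X, y) if lab == minority]
    if len(rows) < 2:
        raise ValueError("need at least two minority rows to interpolate")
    X_out, y_out = list(X), list(y)
    for _ in range(max(counts.values()) - counts[minority]):
        base = rng.choice(rows)
        # Nearest minority neighbours by squared Euclidean distance, excluding base itself.
        others = sorted(
            (r for r in rows if r is not base),
            key=lambda r: sum((a - b) ** 2 for a, b in zip(base, r)),
        )[:k]
        neighbour = rng.choice(others)
        gap = rng.random()  # position along the segment from base to neighbour
        X_out.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
        y_out.append(minority)
    return X_out, y_out


# Toy data (made up): 8 legitimate (0) and 3 fraudulent (1) transactions, two features each.
X = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3], [0.9, 0.7], [1.3, 1.0],
     [1.0, 0.8], [0.7, 1.2], [5.0, 5.0], [5.5, 4.8], [4.9, 5.4]]
y = [0] * 8 + [1] * 3

X_ros, y_ros = random_oversample(X, y)
X_sm, y_sm = smote_like(X, y, minority=1, k=2)
print(Counter(y_ros), Counter(y_sm))
```

Random oversampling only repeats existing fraud rows, whereas the SMOTE-style variant creates new points on line segments between neighbouring fraud rows. Either way, balancing is normally applied to the training split only, so that the evaluation data keeps the original class ratio.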
Pages: 28-37
Page count: 10