Enhanced autoencoder-based fraud detection: a novel approach with noise factor encoding and SMOTE

Cited by: 3
Authors
Cakir, Mert Yilmaz [1]
Sirin, Yahya [1]
Affiliations
[1] Istanbul Sabahattin Zaim Univ, Comp Sci & Engn, TR-34303 Istanbul, Turkiye
Keywords
Fraud detection; Noise factor encoding; Autoencoder; Variational autoencoder; Contractive autoencoder; SMOTE; CREDIT; DIMENSIONALITY
DOI
10.1007/s10115-023-02016-z
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Fraud detection is a critical task across various domains, requiring accurate identification of fraudulent activities within vast arrays of transactional data. The significant challenges in effectively detecting fraud stem from the inherent class imbalance between normal and fraudulent instances. To address this issue, we propose a novel approach that combines autoencoder-based noise factor encoding (NFE) with the synthetic minority oversampling technique (SMOTE). Our study evaluates the efficacy of this approach using three datasets with severe class imbalance. We compare three autoencoder variants, namely the autoencoder (AE), variational autoencoder (VAE), and contractive autoencoder (CAE), each enhanced by the NFE technique. This technique involves training autoencoder models on real fraud data with an added noise factor during the encoding process, followed by combining this altered data with genuine fraud data. Subsequently, SMOTE is employed for oversampling. Through extensive experimentation, we assess various evaluation metrics. Our results demonstrate the superiority of the autoencoder-based NFE approach over the use of traditional oversampling methods like SMOTE alone. Specifically, the AE-NFE method outperforms other techniques in most cases, although the VAE-NFE and CAE-NFE methods also exhibit promising results in specific scenarios. This study highlights the effectiveness of leveraging autoencoder-based NFE and SMOTE for fraud detection. By addressing class imbalance and enhancing the performance of fraud detection models, our approach enables more accurate identification and prevention of fraudulent activities in real-world applications.
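The pipeline the abstract describes (train an autoencoder on fraud rows, add a noise factor during encoding, stack the altered rows with the genuine fraud rows, then oversample with SMOTE) can be illustrated with a minimal NumPy sketch. Everything concrete here is an assumption made for illustration, since the abstract does not give these details: the toy Gaussian data, the tiny tanh autoencoder, Gaussian noise added to the latent code, the `noise_factor` value, and the hand-written `smote` helper. It is not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_autoencoder(X, latent_dim=2, lr=0.01, epochs=500):
    """Fit a tiny autoencoder (tanh encoder, linear decoder) by gradient descent."""
    n, d = X.shape
    W_enc = rng.normal(scale=0.1, size=(d, latent_dim))
    W_dec = rng.normal(scale=0.1, size=(latent_dim, d))
    for _ in range(epochs):
        Z = np.tanh(X @ W_enc)          # latent codes
        err = Z @ W_dec - X             # reconstruction error
        grad_dec = Z.T @ err / n
        grad_enc = X.T @ ((err @ W_dec.T) * (1 - Z**2)) / n
        W_dec -= lr * grad_dec
        W_enc -= lr * grad_enc
    return W_enc, W_dec

def noise_factor_encode(X, W_enc, W_dec, noise_factor=0.1):
    """Encode, perturb the latent code with scaled Gaussian noise, then decode.
    Where and how the noise is injected is an assumption of this sketch."""
    Z = np.tanh(X @ W_enc)
    Z_noisy = Z + noise_factor * rng.normal(size=Z.shape)
    return Z_noisy @ W_dec

def smote(X_min, n_new, k=3):
    """Minimal SMOTE: interpolate between a minority row and one of its k nearest neighbours."""
    n = len(X_min)
    dists = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)                 # a row is not its own neighbour
    neighbours = np.argsort(dists, axis=1)[:, :k]
    base = rng.integers(0, n, size=n_new)
    nb = neighbours[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))                    # interpolation weight in [0, 1)
    return X_min[base] + gap * (X_min[nb] - X_min[base])

# Toy imbalanced data (hypothetical): 200 normal rows, 10 fraud rows, 4 features.
X_normal = rng.normal(0.0, 1.0, size=(200, 4))
X_fraud = rng.normal(2.0, 0.5, size=(10, 4))

# 1) Train on fraud only, 2) create noise-altered fraud rows, 3) stack with genuine fraud.
W_enc, W_dec = train_autoencoder(X_fraud)
X_altered = noise_factor_encode(X_fraud, W_enc, W_dec)
X_fraud_aug = np.vstack([X_fraud, X_altered])

# 4) SMOTE on the enlarged fraud set until both classes have the same size.
n_new = len(X_normal) - len(X_fraud_aug)
X_synth = smote(X_fraud_aug, n_new)

X_balanced = np.vstack([X_normal, X_fraud_aug, X_synth])
y_balanced = np.concatenate([np.zeros(len(X_normal)), np.ones(len(X_fraud_aug) + n_new)])
```

In this sketch the noise-altered rows enlarge the pool of fraud samples that SMOTE interpolates between. In practice a maintained SMOTE implementation and a deep-learning framework would likely replace the hand-written helpers.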
Pages: 635-652
Page count: 18