Oversampling Techniques for Detecting Bitcoin Illegal Transactions

Citations: 0
Authors
Han, Jungsu [1]
Woo, Jongsoo [2]
Hong, Jame Won-Ki [1]
Affiliations
[1] POSTECH, Dept Comp Sci & Engn, Pohang Si, Gyeongsangbuk D, South Korea
[2] POSTECH, Ctr Crypto Blockchain Res, Pohang Si, Gyeongsangbuk D, South Korea
Source
APNOMS 2020: 2020 21ST ASIA-PACIFIC NETWORK OPERATIONS AND MANAGEMENT SYMPOSIUM (APNOMS) | 2020
Keywords
Blockchain; Imbalanced data; Oversampling; Illegal detection; Classification; SMOTE; GAN;
DOI
10.23919/apnoms50412.2020.9236780
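Chinese Library Classification number
TN [Electronic technology, communication technology];
Discipline classification code
0809;
Abstract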
Bitcoin users are guaranteed to be anonymous, which has increased the number of cryptocurrency trades related to crime and fraud. Most studies on detecting illegal transactions try to identify trading patterns that distinguish them from legitimate ones, but classification performance is poor because the class distribution of transaction data is highly imbalanced. The Synthetic Minority Over-sampling TEchnique (SMOTE) is commonly used to deal with class-imbalanced data, but it does not fully represent the diversity of the data. In this paper, we introduce another oversampling technique that uses Generative Adversarial Networks (GAN) to generate artificial training data for a classification model. To verify the similarity between the artificial data and the actual data, the oversampled dataset is evaluated with a classification model using the XGBoost algorithm. We show that classification performance improves on average with synthetic data generated by both SMOTE and a well-designed GAN model.
Pages: 330-333
Page count: 4
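Illustrative sketch
The SMOTE step named in the abstract can be sketched as follows. This is a minimal, standard-library-only illustration of the general SMOTE idea (interpolating between a minority sample and one of its nearest minority neighbors), not the authors' implementation. The function name `smote`, the parameters `n_new`, `k`, and `seed`, and the toy data are all assumptions made for this example.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points (hypothetical helper, not from the paper).

    Each point is a random interpolation between a randomly chosen
    minority sample and one of its k nearest minority neighbors.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbor = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


# Toy data (assumed): 4 minority samples to be balanced against 10 majority ones.
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
majority_count = 10
new_points = smote(minority, n_new=majority_count - len(minority))
balanced_minority = minority + new_points
```

Every synthetic point lies on a line segment between two existing minority samples, so the generated data stays within the span of the observed samples. This is one plausible reading of the abstract's remark that SMOTE does not fully represent the diversity of the data.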