FLEX-SMOTE: Synthetic over-sampling technique that flexibly adjusts to different minority class distributions

Times cited: 10
Authors
Bunkhumpornpat, Chumphol [1, 2]
Boonchieng, Ekkarat [1, 2]
Chouvatut, Varin [1, 2]
Lipsky, David [1, 2]
Affiliations
[1] Chiang Mai Univ, Fac Sci, Dept Comp Sci, Chiang Mai 50200, Thailand
[2] Chiang Mai Univ, Ctr Excellence Community Hlth Informat, Chiang Mai 50200, Thailand
Source
PATTERNS | 2024 / Volume 5 / Issue 11
Keywords
Highlights;
DOI
10.1016/j.patter.2024.101073
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
THE BIGGER PICTURE Machine learning methods often encounter imbalanced classification problems, particularly when dealing with binary classification. This occurs when the distribution of classes in the training dataset is uneven, which can lead to bias in the trained model. Fraud detection, claim prediction, default prediction, spam filtering, disease screening, churn prediction, anomaly detection, and outlier identification tasks are a few examples of imbalanced classification issues. To enhance the performance and ensure the accuracy of a model, it is crucial to address the issue of class imbalance. Predictive modeling is complicated by imbalanced datasets, but this is to be expected, as the real world consists of biased cases. By preventing the dataset from becoming biased toward one class, balancing it makes it easier to train a model. To put it another way, the model will not continue to favor the majority class solely based on having more data.
Pages: 12