One-Class versus Binary Classification: Which and When?

Cited by: 58
Authors
Bellinger, Colin [1]
Sharma, Shiven [1]
Japkowicz, Nathalie [1]
Affiliations
[1] Univ Ottawa, SITE, 800 King Edward Ave, Ottawa, ON, Canada
Source
2012 11th International Conference on Machine Learning and Applications (ICMLA 2012), Vol. 2 | 2012
Keywords
Machine learning; one-class classification; binary classification; imbalanced data
DOI
10.1109/ICMLA.2012.212
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Binary classifiers have typically been the norm for building classification models in the machine learning community. However, an alternative to binary classification is one-class classification, which aims to build models using only a single class of data. This is particularly useful when there is an overabundance of data of a particular class. In such imbalanced cases, binary classifiers may not perform very well, and one-class classifiers then become the viable option. In this paper, we investigate the performance of binary and one-class classifiers as the level of imbalance, and thus the uncertainty in the second class, increases. Our objective is to gain insight into which classification paradigm becomes more suitable as imbalance and uncertainty increase. To this end, we conduct experiments on various datasets, both artificial and from the UCI repository, and monitor the performance of the binary and one-class classifiers as the size of the second class gradually decreases, thus increasing the level of imbalance. The results show that as the level of imbalance increases, the performance of binary classifiers decreases, whereas that of one-class classifiers stays relatively stable.
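The experimental protocol described in the abstract (shrink the second class step by step and compare both paradigms on a fixed test set) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' setup: the 1-D Gaussian data, the nearest-class-mean binary model, the distance-threshold one-class model, the balanced-accuracy metric, and all function names are hypothetical stand-ins chosen so the example runs with the Python standard library alone.

```python
import random
import statistics


def sample(n, mean, rng):
    """Draw n points from a 1-D unit-variance Gaussian centred at `mean`."""
    return [rng.gauss(mean, 1.0) for _ in range(n)]


def fit_one_class(majority, quantile=0.95):
    """Stand-in one-class model: trained on the majority class only.

    Accepts a point as 'majority' if it lies within a distance threshold of
    the majority mean; the threshold covers roughly `quantile` of the
    training points.
    """
    mu = statistics.fmean(majority)
    dists = sorted(abs(x - mu) for x in majority)
    threshold = dists[int(quantile * (len(dists) - 1))]
    return lambda x: abs(x - mu) <= threshold


def fit_binary(majority, minority):
    """Stand-in binary model: assign each point to the nearest class mean."""
    mu_maj = statistics.fmean(majority)
    mu_min = statistics.fmean(minority)
    return lambda x: abs(x - mu_maj) <= abs(x - mu_min)


def balanced_accuracy(predicts_majority, test_maj, test_min):
    """Mean of the per-class recalls, so imbalance does not inflate the score."""
    recall_maj = sum(predicts_majority(x) for x in test_maj) / len(test_maj)
    recall_min = sum(not predicts_majority(x) for x in test_min) / len(test_min)
    return (recall_maj + recall_min) / 2


def run_experiment(minority_sizes=(500, 100, 20, 5, 1), seed=0):
    """Return (minority_size, binary_score, one_class_score) per imbalance level."""
    rng = random.Random(seed)
    train_maj = sample(500, 0.0, rng)
    test_maj = sample(1000, 0.0, rng)
    test_min = sample(1000, 3.0, rng)

    # The one-class model never sees the second class, so it is fitted once.
    one_class = fit_one_class(train_maj)
    one_class_score = balanced_accuracy(one_class, test_maj, test_min)

    results = []
    for n in minority_sizes:
        train_min = sample(n, 3.0, rng)  # shrinking second class
        binary = fit_binary(train_maj, train_min)
        binary_score = balanced_accuracy(binary, test_maj, test_min)
        results.append((n, binary_score, one_class_score))
    return results


if __name__ == "__main__":
    for n, binary_score, one_class_score in run_experiment():
        print(f"minority={n:4d}  binary={binary_score:.3f}  one-class={one_class_score:.3f}")
```

Because the one-class model is fitted without any second-class data, its score is constant across imbalance levels by construction; only the binary model's score can change as the second class shrinks. The sketch shows the shape of the comparison and makes no claim to reproduce the paper's reported numbers.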
Pages: 102-106
Number of pages: 5