One-Class versus Binary Classification: Which and When?

Cited by: 58
Authors
Bellinger, Colin [1]
Sharma, Shiven [1]
Japkowicz, Nathalie [1]
Affiliations
[1] Univ Ottawa, SITE, 800 King Edward Ave, Ottawa, ON, Canada
Source
2012 11TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA 2012), VOL 2 | 2012
Keywords
Machine learning; one-class classification; binary classification; imbalanced data
DOI
10.1109/ICMLA.2012.212
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Binary classifiers have typically been the norm for building classification models in the Machine Learning community. However, an alternative to binary classification is one-class classification, which aims to build models using only a single class of data. This is particularly useful when there is an overabundance of data of a particular class. In such imbalanced cases, binary classifiers may not perform very well, and one-class classifiers then become the viable option. In this paper, we investigate the performance of binary and one-class classifiers as the level of imbalance, and thus the uncertainty in the second class, increases. Our objective is to gain insight into which classification paradigm becomes more suitable as imbalance and uncertainty increase. To this end, we conduct experiments on various datasets, both artificial and from the UCI repository, and monitor the performance of the binary and one-class classifiers as the size of the second class gradually decreases, thus increasing the level of imbalance. The results show that as the level of imbalance increases, the performance of binary classifiers decreases, whereas that of one-class classifiers stays relatively stable.
Pages: 102-106
Number of pages: 5
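
The experimental protocol summarized in the abstract (shrink the second class step by step and score both paradigms at each step) can be illustrated with a small sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the two 2-D Gaussian classes, the nearest-centroid rule standing in for a binary classifier, the distance-threshold rule standing in for a one-class classifier, the 95% threshold, balanced accuracy as the metric, and the name `run_experiment`. It uses only the Python standard library and is not intended to reproduce the paper's results.

```python
import random


def sample(rng, center, n):
    """Draw n 2-D points from an isotropic unit-variance Gaussian."""
    return [(rng.gauss(center[0], 1.0), rng.gauss(center[1], 1.0)) for _ in range(n)]


def centroid(points):
    """Per-coordinate mean of a non-empty list of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def sq_dist(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall; label 0 = majority, 1 = second (minority) class."""
    recalls = []
    for label in (0, 1):
        idx = [i for i, y in enumerate(y_true) if y == label]
        recalls.append(sum(y_pred[i] == label for i in idx) / len(idx))
    return sum(recalls) / 2


def run_experiment(minority_sizes=(200, 50, 10, 2), n_majority=200, seed=0):
    """Return (minority_size, binary_score, one_class_score) for each size."""
    rng = random.Random(seed)
    majority_train = sample(rng, (0.0, 0.0), n_majority)
    minority_pool = sample(rng, (3.0, 3.0), max(minority_sizes))
    test_x = sample(rng, (0.0, 0.0), 200) + sample(rng, (3.0, 3.0), 200)
    test_y = [0] * 200 + [1] * 200

    # One-class stand-in: fitted on the majority class only. A test point is
    # flagged as the second class if it lies farther from the majority centroid
    # than roughly 95% of the majority training points (assumed threshold).
    c0 = centroid(majority_train)
    train_scores = sorted(sq_dist(p, c0) for p in majority_train)
    threshold = train_scores[int(0.95 * (len(train_scores) - 1))]
    one_class_pred = [1 if sq_dist(p, c0) > threshold else 0 for p in test_x]
    one_class_score = balanced_accuracy(test_y, one_class_pred)

    results = []
    for m in minority_sizes:
        # Binary stand-in: nearest centroid, using only the first m minority points.
        c1 = centroid(minority_pool[:m])
        binary_pred = [1 if sq_dist(p, c1) < sq_dist(p, c0) else 0 for p in test_x]
        results.append((m, balanced_accuracy(test_y, binary_pred), one_class_score))
    return results


if __name__ == "__main__":
    for m, binary_score, one_class_score in run_experiment():
        print(f"minority={m:4d}  binary={binary_score:.3f}  one-class={one_class_score:.3f}")
```

In this sketch the one-class score is constant across imbalance levels by construction, because that model never sees the second class. How much the binary score moves depends heavily on the chosen classifier and data, so the numbers printed here should not be read as evidence for or against the paper's findings.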