Large Margin Distribution Learning with Cost Interval and Unlabeled Data

Cited by: 34
Authors
Zhou, Yu-Hang [1]
Zhou, Zhi-Hua [1]
Affiliations
[1] Nanjing Univ, Natl Key Lab Novel Software Technol, Nanjing 210023, Jiangsu, Peoples R China
Funding
U.S. National Science Foundation;
Keywords
Margin distribution; cost interval; semi-supervised learning; support vector machines; classification
DOI
10.1109/TKDE.2016.2535283
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In many real-world applications, different types of misclassification incur different costs, but the exact cost is often hard to determine; usually one can obtain only an interval estimate, for example that one type of mistake is about 5 to 10 times more serious than the other. Meanwhile, abundant unlabeled data are usually available, which has motivated extensive research on semi-supervised learning. Cost intervals and unlabeled data often appear together in practical tasks; however, few studies tackle them jointly. In this paper, we propose the cisLDM approach, which handles cost intervals and exploits unlabeled data in a principled way. Rather than maximizing the minimum margin as traditional large margin classifiers do, cisLDM optimizes the margin distribution on both labeled and unlabeled data while simultaneously minimizing the worst-case total-cost and the mean total-cost according to the cost interval. Experiments on a broad range of datasets and cost settings demonstrate the impressive performance of cisLDM. In particular, cisLDM reduces 47 percent more total-cost than standard SVM and 27 percent more total-cost than cost-sensitive semi-supervised SVM, which assumes the true cost value is known in advance.
Pages: 1749-1763
Page count: 15