Fast semi-supervised SVM classifiers using a priori metric information

Cited by: 3
Authors
Vural, Volkan [1]
Fung, Glenn [2]
Dy, Jennifer G. [1]
Rao, Bharat [2]
Affiliations
[1] Northeastern Univ, Dept Elect & Comp Engn, Boston, MA 02115 USA
[2] Siemens Med Solut, Comp Aided Diag & Therapy, Malvern, PA USA
Funding
National Science Foundation (USA)
Keywords
semi-supervised learning; SVM; linear programming; unconstrained optimization
DOI
10.1080/10556780802102750
Chinese Library Classification number
TP31 [Computer software]
Discipline classification codes
081202; 0835
Abstract
This paper describes a support vector machine (SVM)-based parametric optimization method for semi-supervised classification, called LIAM (for linear hyperplane classifier with a priori metric information). Our method uses similarity information to exploit the unlabelled data when training SVMs. In addition to the smoothness constraints found in existing semi-supervised methods, LIAM incorporates local class similarity constraints which, as we show empirically, improve accuracy when only a few labelled points are available. We present and discuss a general convex mathematical-programming-based formulation for the inductive semi-supervised problem; i.e. our proposed algorithm directly classifies test samples that were not present during training. This general formulation yields different variants depending on the choice of norms used in the objective function. For example, with the 1-norm the proposed formulation becomes a linear programming problem, which has the advantage of generating sparse solutions that depend on a minimal set of the original features (feature selection). On the other hand, one of the proposed formulations results in an unconstrained quadratic problem whose solution can be obtained by solving a simple system of linear equations, giving a fast, competitive alternative to state-of-the-art semi-supervised algorithms. Our experiments on public benchmarks indicate that LIAM is at least one order of magnitude faster than other state-of-the-art semi-supervised classification methods and, in most cases, at least as accurate.
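The abstract notes that one variant reduces to an unconstrained quadratic problem solved by a single system of linear equations. The sketch below illustrates that general idea only. It is not the LIAM formulation, which this record does not spell out; it is a generic graph-regularized least-squares linear classifier. The function names, the Gaussian similarity, and the parameters `lam`, `mu` and `sigma` are all assumptions made for illustration.

```python
import numpy as np


def fit_graph_regularized_ls(X_lab, y_lab, X_unl, lam=1e-2, mu=1.0, sigma=1.0):
    """Fit f(x) = w.x + b by solving one linear system (illustrative, not LIAM).

    Minimizes, over theta = (w, b):
        ||Z_lab theta - y||^2 + lam * ||w||^2 + mu * f^T L f
    where f = Z_all theta are the outputs on all points and L is a graph
    Laplacian, so f^T L f = 0.5 * sum_ij S_ij * (f_i - f_j)^2.
    """
    X_lab = np.asarray(X_lab, dtype=float)
    X_unl = np.asarray(X_unl, dtype=float)
    y_lab = np.asarray(y_lab, dtype=float)
    n_lab, d = X_lab.shape

    # Stack labelled and unlabelled points; append a column of ones for the bias.
    X_all = np.vstack([X_lab, X_unl])
    Z_all = np.hstack([X_all, np.ones((X_all.shape[0], 1))])
    Z_lab = Z_all[:n_lab]

    # Assumed similarity: Gaussian kernel on pairwise squared distances.
    diff = X_all[:, None, :] - X_all[None, :, :]
    S = np.exp(-np.sum(diff ** 2, axis=2) / (2.0 * sigma ** 2))
    np.fill_diagonal(S, 0.0)
    L = np.diag(S.sum(axis=1)) - S  # unnormalized graph Laplacian

    # Ridge penalty on w only; the bias term is left unpenalized.
    R = np.eye(d + 1)
    R[-1, -1] = 0.0

    # Setting the gradient of the quadratic objective to zero gives H theta = g.
    H = Z_lab.T @ Z_lab + lam * R + mu * (Z_all.T @ L @ Z_all)
    g = Z_lab.T @ y_lab
    theta = np.linalg.solve(H, g)
    return theta[:-1], theta[-1]


def predict(X, w, b):
    """Inductive prediction: label any point, seen in training or not, as +1 or -1."""
    return np.where(np.asarray(X, dtype=float) @ w + b >= 0.0, 1, -1)
```

Because the model is an explicit hyperplane `(w, b)`, new test points are classified with `predict` without refitting, which is what "inductive" means in the abstract. Labels passed as `y_lab` are assumed to be in {-1, +1}.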
Pages: 521-532
Page count: 12