Support Vector Data Descriptions and k-Means Clustering: One Class?

Cited by: 30
Authors
Goernitz, Nico [1]
Lima, Luiz Alberto [2,3]
Mueller, Klaus-Robert [1,4,5]
Kloft, Marius [6]
Nakajima, Shinichi [1]
Affiliations
[1] Berlin Inst Technol, Machine Learning Grp, D-10587 Berlin, Germany
[2] Pontifical Catholic Univ Rio de Janeiro, BR-22543900 Rio De Janeiro, Brazil
[3] Petrobras SA, BR-20031912 Rio De Janeiro, Brazil
[4] Korea Univ, Dept Brain & Cognit Engn, Seoul 136713, South Korea
[5] Max Planck Inst Informat, D-66123 Saarbrucken, Germany
[6] Humboldt Univ, Dept Comp Sci, Machine Learning Grp, D-12489 Berlin, Germany
Funding
National Research Foundation, Singapore;
Keywords
Anomaly detection; clustering; k-means; one-class classification; support vector data description (SVDD); KERNEL; SVMS;
DOI
10.1109/TNNLS.2017.2737941
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We present ClusterSVDD, a methodology that unifies support vector data descriptions (SVDDs) and k-means clustering into a single formulation. This allows each method to benefit from the other: SVDDs gain flexibility through the use of multiple spheres, and k-means gains anomaly resistance and, through kernels, additional flexibility. In particular, our approach leads to a new interpretation of k-means as a regularized mode-seeking algorithm. The unifying formulation further allows for deriving new algorithms by transferring knowledge from one-class learning settings to clustering settings and vice versa. As a showcase, we derive a clustering method for structured data based on a one-class learning scenario. Additionally, our formulation can be solved via a particularly simple optimization scheme. We evaluate our approach empirically to highlight some of the proposed benefits on artificially generated data, as well as on real-world problems, and provide a Python software package comprising various implementations of primal and dual SVDD as well as our proposed ClusterSVDD.
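The idea of describing data with several spheres that are fitted by a simple alternating scheme can be illustrated with a minimal sketch. This is not the ClusterSVDD algorithm or the authors' package: the function name `multi_sphere_sketch`, the trimming fraction `nu`, and the center update (the mean of the closest `1 - nu` fraction of each cluster, standing in for a per-cluster SVDD solve) are all assumptions made for illustration, and only the linear (non-kernel) case is covered.

```python
import numpy as np


def multi_sphere_sketch(X, k, nu=0.1, n_iter=20, seed=0):
    """Alternate between cluster assignment and per-cluster center updates.

    Illustrative simplification (an assumption, not the published method):
    each center becomes the mean of the (1 - nu) fraction of its assigned
    points that lie closest to it, so distant points do not pull the center.
    """
    rng = np.random.default_rng(seed)
    # Initialize the centers with k distinct data points.
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)

    for _ in range(n_iter):
        # Squared Euclidean distance of every point to every center.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)

        for j in range(k):
            idx = np.flatnonzero(labels == j)
            if idx.size == 0:
                continue  # keep the previous center for an empty cluster
            # Number of closest points that define the updated center.
            keep = max(1, int(np.ceil((1.0 - nu) * idx.size)))
            inner = idx[np.argsort(d2[idx, j])[:keep]]
            centers[j] = X[inner].mean(axis=0)

    # Final assignment and anomaly scores (larger means more anomalous).
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return centers, d2.argmin(axis=1), d2.min(axis=1)
```

With `nu=0` every assigned point contributes to its center, so the update in this sketch coincides with the standard k-means (Lloyd) step; with `k=1` it reduces to a single trimmed-mean center. The returned scores are squared distances to the nearest center and could be thresholded to flag anomalies.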
Pages: 3994-4006
Number of pages: 13