Quantization/Clustering: when and why does k-means work?

被引:0
作者
Levrard, Clement [1 ]
机构
[1] Univ Paris Diderot, LPMA, 8 Pl Aure Lie Nemours, F-75013 Paris, France
来源
JOURNAL OF THE SFDS | 2018年 / 159卷 / 01期
关键词
k-means; clustering; quantization; separation rate; distortion;
D O I
暂无
中图分类号
O21 [概率论与数理统计]; C8 [统计学];
学科分类号
020208 ; 070103 ; 0714 ;
摘要
Though mostly used as a clustering algorithm, k-means is originally designed as a quantization algorithm. Namely, it aims at providing a compression of a probability distribution with k points. Building upon Levrard (2015); Tang and Monteleoni (2016a), we try to investigate how and when these two approaches are compatible. Namely, we show that provided the sample distribution satisfies a margin like condition (in the sense of Mammen and Tsybakov, 1999 for supervised learning), both the associated empirical risk minimizer and the output of Lloyd's algorithm provide almost optimal classification in certain cases (in the sense of Azizyan et al., 2013). Besides, we also show that they achieved fast and optimal convergence rates in terms of sample size and compression risk.
引用
收藏
页码:1 / 26
页数:26
相关论文
共 50 条
  • [21] Locally Private k-Means Clustering
    Stemmer, Uri
    JOURNAL OF MACHINE LEARNING RESEARCH, 2021, 22
  • [22] APPLICATION OF METAHEURISTICS TO K-MEANS CLUSTERING
    Lisin, A. V.
    Faizullin, R. T.
    COMPUTER OPTICS, 2015, 39 (03) : 406 - 412
  • [23] Unsupervised K-Means Clustering Algorithm
    Sinaga, Kristina P.
    Yang, Miin-Shen
    IEEE ACCESS, 2020, 8 : 80716 - 80727
  • [24] An Enhancement of K-means Clustering Algorithm
    Gu, Jirong
    Zhou, Jieming
    Chen, Xianwei
    2009 INTERNATIONAL CONFERENCE ON BUSINESS INTELLIGENCE AND FINANCIAL ENGINEERING, PROCEEDINGS, 2009, : 237 - 240
  • [25] Selection of K in K-means clustering
    Pham, DT
    Dimov, SS
    Nguyen, CD
    PROCEEDINGS OF THE INSTITUTION OF MECHANICAL ENGINEERS PART C-JOURNAL OF MECHANICAL ENGINEERING SCIENCE, 2005, 219 (01) : 103 - 119
  • [26] The incremental online k-means clustering algorithm and its application to color quantization
    Abernathy, Amber
    Celebi, M. Emre
    EXPERT SYSTEMS WITH APPLICATIONS, 2022, 207
  • [27] Balanced K-Means for Clustering
    Malinen, Mikko I.
    Franti, Pasi
    STRUCTURAL, SYNTACTIC, AND STATISTICAL PATTERN RECOGNITION, 2014, 8621 : 32 - 41
  • [28] Spherical k-Means Clustering
    Hornik, Kurt
    Feinerer, Ingo
    Kober, Martin
    Buchta, Christian
    JOURNAL OF STATISTICAL SOFTWARE, 2012, 50 (10): : 1 - 22
  • [29] Subspace K-means clustering
    Timmerman, Marieke E.
    Ceulemans, Eva
    De Roover, Kim
    Van Leeuwen, Karla
    BEHAVIOR RESEARCH METHODS, 2013, 45 (04) : 1011 - 1023
  • [30] Clustering of Image Data Using K-Means and Fuzzy K-Means
    Rahmani, Md. Khalid Imam
    Pal, Naina
    Arora, Kamiya
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2014, 5 (07) : 160 - 163