A Nonparametric Clustering Algorithm with a Quantile-Based Likelihood Estimator

Citations: 4
Authors
Hino, Hideitsu [1]
Murata, Noboru [2]
Affiliations
[1] Univ Tsukuba, Grad Sch Syst & Informat Engn, Tsukuba, Ibaraki 3058573, Japan
[2] Waseda Univ, Sch Sci & Engn, Shinjuku Ku, Tokyo 1698555, Japan
DOI
10.1162/NECO_a_00628
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Clustering is a representative unsupervised learning task and an important approach in exploratory data analysis. By its very nature, clustering that makes no strong assumptions about the data distribution is desirable. Information-theoretic clustering is a class of clustering methods that optimize information-theoretic quantities such as entropy and mutual information. These quantities can be estimated in a nonparametric manner, and information-theoretic clustering algorithms are capable of capturing various intrinsic data structures. It is also possible to estimate information-theoretic quantities using a data set with a sampling weight for each datum. Assuming the data set is sampled from a certain cluster and assigning different sampling weights depending on the clusters, the cluster-conditional information-theoretic quantities are estimated. In this letter, a simple iterative clustering algorithm is proposed based on a nonparametric estimator of the log likelihood for weighted data sets. The clustering algorithm is also derived from the principle of conditional entropy minimization with maximum entropy regularization. The proposed algorithm does not contain a tuning parameter. The algorithm is experimentally shown to be comparable to or to outperform conventional nonparametric clustering methods.
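As an illustration of what "estimated in a nonparametric manner" can mean for entropy, the sketch below implements the classical Kozachenko-Leonenko k-nearest-neighbor estimator of differential entropy. This is an assumption chosen for illustration only: it is not the quantile-based likelihood estimator proposed in the letter, it ignores sampling weights, and the names `knn_entropy`, `digamma_int`, and the choice `k = 5` are hypothetical.

```python
import math
import numpy as np


def digamma_int(n: int) -> float:
    """Digamma at a positive integer: psi(n) = -gamma + sum_{j=1}^{n-1} 1/j."""
    return -np.euler_gamma + sum(1.0 / j for j in range(1, n))


def knn_entropy(x: np.ndarray, k: int = 5) -> float:
    """Kozachenko-Leonenko k-NN estimate of differential entropy, in nats.

    Illustrative only: unweighted, and unrelated to the letter's estimator.
    Assumes the samples are distinct (no zero neighbor distances).
    """
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]  # treat a flat array as n samples in one dimension
    n, d = x.shape
    # Full pairwise Euclidean distance matrix (O(n^2) memory, fine for small n).
    diff = x[:, None, :] - x[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # After sorting each row, column 0 is the zero self-distance,
    # so column k holds the distance to the k-th nearest neighbor.
    eps = np.sort(dist, axis=1)[:, k]
    # Log volume of the unit ball in d dimensions: pi^(d/2) / Gamma(d/2 + 1).
    log_unit_ball = (d / 2.0) * math.log(math.pi) - math.lgamma(d / 2.0 + 1.0)
    return (digamma_int(n) - digamma_int(k) + log_unit_ball
            + d * float(np.mean(np.log(eps))))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sample = rng.standard_normal(1000)
    print("k-NN estimate:", knn_entropy(sample, k=5))
    print("closed form  :", 0.5 * math.log(2 * math.pi * math.e))
```

For a standard normal sample the estimate can be compared with the closed-form entropy 0.5 log(2πe), as the usage lines do. No distributional family is assumed anywhere in the estimator, which is the property the abstract relies on.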
Pages: 2074-2101
Number of pages: 28
Related papers
50 in total
  • [41] A nonparametric maximum likelihood estimator for incomplete renewal data
    McClean, S
    Devine, C
    BIOMETRIKA, 1995, 82 (04) : 791 - 803
  • [42] Bootstrap Inference for Quantile-based Modal Regression
    Zhang, Tao
    Kato, Kengo
    Ruppert, David
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2023, 118 (541) : 122 - 134
  • [43] Semiparametric quantile regression using family of quantile-based asymmetric densities
    Gijbels, Irene
    Karim, Rezaul
    Verhasselt, Anneleen
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2021, 157
  • [44] Bayesian estimation of a quantile-based factor model
    Redivo, Edoardo
    Viroli, Cinzia
    JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION, 2024, 94 (17) : 3892 - 3932
  • [45] Quantile-based risk sharing with heterogeneous beliefs
    Paul Embrechts
    Haiyan Liu
    Tiantian Mao
    Ruodu Wang
    Mathematical Programming, 2020, 181 : 319 - 347
  • [46] QUANTILE-BASED POLICY OPTIMIZATION FOR REINFORCEMENT LEARNING
    Jiang, Jinyang
    Peng, Yijie
    Hu, Jiaqiao
    2022 WINTER SIMULATION CONFERENCE (WSC), 2022, : 2712 - 2723
  • [47] Quantile-Based Inference for Tempered Stable Distributions
    Fallahgoul, Hasan A.
    Veredas, David
    Fabozzi, Frank J.
    COMPUTATIONAL ECONOMICS, 2019, 53 (01) : 51 - 83
  • [48] Improved design of quantile-based control charts
    Ning, Xianghui
    Wu, Chunjie
    JOURNAL OF INDUSTRIAL AND PRODUCTION ENGINEERING, 2011, 28 (07) : 504 - 511
  • [49] A Generalization of the Quantile-Based Flattened Logistic Distribution
    Chakrabarty T.K.
    Sharma D.
    Annals of Data Science, 2021, 8 (03) : 603 - 627
  • [50] Differential quantile-based sensitivity in discontinuous models
    Pesenti, Silvana M.
    Millossovich, Pietro
    Tsanakas, Andreas
    EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 2025, 322 (02) : 554 - 572