Interpretable Clustering via Soft Clustering Trees

Cited by: 0
Authors
Cohen, Eldan [1]
Affiliation
[1] Univ Toronto, Toronto, ON, Canada
Source
INTEGRATION OF CONSTRAINT PROGRAMMING, ARTIFICIAL INTELLIGENCE, AND OPERATIONS RESEARCH, CPAIOR 2023 | 2023 / Vol. 13884
Keywords
DECISION TREE;
DOI
10.1007/978-3-031-33271-5_19
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Clustering is a popular unsupervised learning task that consists of finding a partition of the data points that groups similar points together. Despite its popularity, most state-of-the-art algorithms do not provide any explanation of the obtained partition, making it hard to interpret. In recent years, several works have considered using decision trees to construct clusters that are inherently interpretable. However, these approaches do not scale to large datasets, do not account for uncertainty in results, and do not support advanced clustering objectives such as spectral clustering. In this work, we present soft clustering trees, an interpretable clustering approach based on soft decision trees that provide probabilistic cluster membership. We model soft clustering trees as a continuous optimization problem that is amenable to efficient optimization techniques. Our approach is designed to output highly sparse decision trees to increase interpretability and to support tree-based spectral clustering. Extensive experiments show that our approach can produce clustering trees of significantly higher quality than the state of the art and can scale to large datasets.
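Illustrative sketch (not from the paper): the abstract mentions soft decision trees that yield probabilistic cluster membership. The snippet below shows one common way such a tree can be evaluated, assuming a complete binary tree whose internal nodes route each point to the right child with a sigmoid probability, and whose leaves are treated as clusters. The function name `soft_tree_leaf_probs`, the heap-order node layout, and the sigmoid gate are assumptions made for illustration; the paper's actual formulation, objective, and sparsity mechanism may differ.

```python
import numpy as np


def sigmoid(z):
    """Logistic function, used here as the soft routing gate (an assumption)."""
    return 1.0 / (1.0 + np.exp(-z))


def soft_tree_leaf_probs(X, W, b):
    """Leaf-membership probabilities for a complete binary soft tree.

    Hypothetical layout: internal nodes are stored in heap order (root = 0,
    children of node k are 2k+1 and 2k+2), so a tree with m internal nodes
    has m + 1 leaves at heap indices m .. 2m.

    X: (n, p) data matrix
    W: (m, p) split weights, one row per internal node
    b: (m,)   split offsets
    Returns an (n, m + 1) matrix whose rows sum to 1.
    """
    n_internal = W.shape[0]
    n_nodes = 2 * n_internal + 1
    go_right = sigmoid(X @ W.T + b)          # (n, m) probability of routing right
    reach = np.ones((X.shape[0], n_nodes))   # probability of reaching each node
    for k in range(n_internal):              # parents always precede children
        reach[:, 2 * k + 1] = reach[:, k] * (1.0 - go_right[:, k])
        reach[:, 2 * k + 2] = reach[:, k] * go_right[:, k]
    return reach[:, n_internal:]             # keep only the leaf columns


# Depth-2 tree with sparse (single-feature) splits, purely for illustration.
X = np.array([[-5.0, -5.0], [5.0, 5.0]])
W = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
b = np.zeros(3)
P = soft_tree_leaf_probs(X, W, b)
print(P.round(3))
```

Each row of `P` is a probability distribution over the four leaves, so a hard cluster assignment can be read off with `P.argmax(axis=1)` while the row itself expresses the uncertainty of that assignment. Rows of `W` with a single nonzero entry correspond to axis-aligned splits, which is one way to read the sparsity the abstract refers to.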
Pages: 281-298
Page count: 18