Incremental clustering based on Wasserstein distance between histogram models

Authors
Qian, Xiaotong [1 ]
Cabanes, Guenael [2 ]
Rastin, Parisa [2 ]
Guidani, Mohamed Alae [3 ]
Marrakchi, Ghassen [4 ]
Clausel, Marianne [2 ]
Grozavu, Nistor [1 ]
Affiliations
[1] CY Cergy Paris Univ, ETIS, UMR 8051, F-95000 Cergy, France
[2] Univ Lorraine, LORIA, UMR 7503, F-54500 Vandoeuvre Les Nancy, France
[3] Ecole Natl Super Mines, Campus Artem, F-54042 Nancy, France
[4] Univ Sorbonne Paris Nord, LIPN, UMR 7030, F-93430 Villetaneuse, France
Keywords
Unsupervised learning; Static and dynamic clustering; Large datasets; Data streams; Sliding windows; Histogram models; Wasserstein distance; STREAMING-DATA; CLASSIFIER; ALGORITHMS;
DOI
10.1016/j.patcog.2025.111414
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this article, we present a clustering framework designed for large datasets and real-time data streams. It uses a sliding window and histogram models to address memory congestion while reducing computational complexity and improving cluster quality for both static and dynamic clustering. The framework provides a simple way to characterize the probability distribution of each cluster through histogram models, regardless of the distribution type. This allows it to be used efficiently with various conventional clustering algorithms. To cluster effectively across windows, we use a statistical measure that compares and merges clusters based on the Wasserstein distance between their histograms.
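The abstract does not spell out how the distance is computed or how the merge decision is made, so the following is only a minimal sketch of the general idea in one dimension, not the authors' procedure. It assumes that two clusters are summarized by histograms over the same bin edges, that each bin's mass sits at the bin centre, and that clusters are merged when the distance falls below a threshold. The function names `wasserstein_1d` and `should_merge` and the `threshold` parameter are hypothetical.

```python
import numpy as np


def wasserstein_1d(counts_p, counts_q, bin_edges):
    """1-Wasserstein distance between two histograms on shared bin edges.

    Assumption: the mass of each bin is concentrated at the bin centre.
    """
    p = np.asarray(counts_p, dtype=float)
    q = np.asarray(counts_q, dtype=float)
    # Normalize counts to probability masses.
    p = p / p.sum()
    q = q / q.sum()
    edges = np.asarray(bin_edges, dtype=float)
    centers = 0.5 * (edges[:-1] + edges[1:])
    # In 1-D, W1 is the area between the two cumulative distributions.
    cdf_gap = np.abs(np.cumsum(p) - np.cumsum(q))
    # The gap after the last bin is zero, so only the first n-1 gaps count.
    return float(np.sum(cdf_gap[:-1] * np.diff(centers)))


def should_merge(counts_p, counts_q, bin_edges, threshold):
    """Hypothetical merge rule: merge when the distance is below a threshold."""
    return wasserstein_1d(counts_p, counts_q, bin_edges) < threshold


edges = [0.0, 1.0, 2.0, 3.0]
far_apart = wasserstein_1d([10, 0, 0], [0, 0, 10], edges)
identical = wasserstein_1d([2, 5, 3], [4, 10, 6], edges)
```

Here `far_apart` measures two clusters whose mass sits in opposite end bins, while `identical` compares two histograms with the same shape but different sample counts. For multivariate clusters a different formulation would be needed; this sketch covers the univariate case only.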
Pages: 21