Representative Clustering of Uncertain Data

Cited by: 21
Authors
Zuefle, Andreas [1 ]
Emrich, Tobias [1 ]
Schmid, Klaus Arthur [1 ]
Mamoulis, Nikos [2 ]
Zimek, Arthur [1 ]
Renz, Matthias [1 ]
Affiliations
[1] Ludwig Maximilians Univ Munchen, Inst Informat, Munich, Germany
[2] Univ Hong Kong, Dept Comp Sci, Hong Kong, Peoples R China
Keywords
CONFIDENCE; VARIANCE;
DOI
10.1145/2623330.2623725
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper targets the problem of computing meaningful clusterings from uncertain data sets. Existing methods for clustering uncertain data compute a single clustering without any indication of its quality and reliability; thus, decisions based on their results are questionable. In this paper, we describe a framework based on possible-worlds semantics: when applied to an uncertain data set, it computes a set of representative clusterings, each of which has a probabilistic guarantee not to exceed some maximum distance to the ground truth clustering, i.e., the clustering of the actual (but unknown) data. Our framework can be combined with any existing clustering algorithm and it is the first to provide quality guarantees about its result. In addition, our experimental evaluation shows that our representative clusterings have a much smaller deviation from the ground truth clustering than existing approaches, thus reducing the effect of uncertainty.
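The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm, only one way to read the description under stated assumptions: each uncertain object is a hypothetical discrete distribution over 1-D values, possible worlds are drawn by sampling, `gap_clustering` stands in for "any existing clustering algorithm", the distance between clusterings is taken to be 1 minus the Rand index, and the representative is the medoid of the sampled clusterings. The returned fraction is only an empirical, sample-based estimate of how often a possible world clusters within `tau` of the representative, not the formal guarantee the paper claims. All function names and parameters are illustrative.

```python
import random
from itertools import combinations


def sample_world(objects, rng):
    """Draw one possible world: pick one alternative value per uncertain object."""
    return [rng.choices(values, weights=probs, k=1)[0] for values, probs in objects]


def gap_clustering(points, gap=1.0):
    """Stand-in clustering algorithm (assumption): split sorted 1-D points at gaps > `gap`."""
    order = sorted(range(len(points)), key=lambda i: points[i])
    labels = [0] * len(points)
    current = 0
    for prev, nxt in zip(order, order[1:]):
        if points[nxt] - points[prev] > gap:
            current += 1
        labels[nxt] = current
    return labels


def rand_distance(a, b):
    """1 - Rand index: fraction of object pairs on which two clusterings disagree."""
    pairs = list(combinations(range(len(a)), 2))
    disagree = sum((a[i] == a[j]) != (b[i] == b[j]) for i, j in pairs)
    return disagree / len(pairs)


def representative_clustering(objects, n_worlds=200, tau=0.2, seed=0):
    """Return the medoid of the sampled clusterings and the fraction of
    sampled worlds whose clustering lies within distance `tau` of it."""
    rng = random.Random(seed)
    clusterings = [gap_clustering(sample_world(objects, rng)) for _ in range(n_worlds)]
    medoid = min(clusterings, key=lambda c: sum(rand_distance(c, o) for o in clusterings))
    within = sum(rand_distance(medoid, c) <= tau for c in clusterings) / n_worlds
    return medoid, within


# Hypothetical uncertain data set: (alternative values, probabilities) per object.
objects = [
    ([0.0, 0.1], [0.5, 0.5]),
    ([0.3], [1.0]),
    ([5.0, 5.2], [0.5, 0.5]),
    ([5.5], [1.0]),
]
medoid, within = representative_clustering(objects)
print(medoid, within)
```

In this toy data set the two groups are far apart, so every sampled world yields the same clustering; with more overlapping alternatives the estimated fraction drops, which is the situation in which reporting several representatives instead of one becomes informative.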
Pages: 243 - 252
Page count: 10
Related papers
50 in total
  • [41] Data Selection for Exact Value Acquisition to Improve Uncertain Clustering
    Lin, Yu-Chieh
    Yang, De-Nian
    Chen, Ming-Syan
    WEB-AGE INFORMATION MANAGEMENT, PROCEEDINGS, 2010, 6184 : 459 - +
  • [43] Constraint Based Subspace Clustering for High Dimensional Uncertain Data
    Zhang, Xianchao
    Gao, Lu
    Yu, Hong
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PAKDD 2016, PT II, 2016, 9652 : 271 - 282
  • [44] Uncertain Data Clustering in Distributed Peer-to-Peer Networks
    Zhou, Jin
    Chen, Long
    Chen, C. L. Philip
    Wang, Yingxu
    Li, Han-Xiong
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2018, 29 (06) : 2392 - 2406
  • [45] Uncertain Data Clustering Based on Probability Distribution in Obstacle Space
    Jing Wan
    Meiyu Cui
    Yunbin He
    Song Li
    Wireless Personal Communications, 2020, 111 : 2191 - 2214
  • [46] Authentication of uncertain data based on k-means clustering
    Unver, Levent
    Gundem, Taflan I.
    TURKISH JOURNAL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCES, 2016, 24 (04) : 2910 - 2928
  • [47] Incremental clustering algorithm based on representative points and covariance for large data
    Li J.
    Wu Q.
    Li L.
    Sun R.
    Mu H.
    Zhao K.
    International Journal of Simulation and Process Modelling, 2023, 20 (02) : 113 - 124
  • [48] Clustering uncertain trajectories
    Nikos Pelekis
    Ioannis Kopanakis
    Evangelos E. Kotsifakos
    Elias Frentzos
    Yannis Theodoridis
    Knowledge and Information Systems, 2011, 28 : 117 - 147
  • [49] Clustering uncertain trajectories
    Pelekis, Nikos
    Kopanakis, Ioannis
    Kotsifakos, Evangelos E.
    Frentzos, Elias
    Theodoridis, Yannis
    KNOWLEDGE AND INFORMATION SYSTEMS, 2011, 28 (01) : 117 - 147
  • [50] Spectral Clustering based on JS-divergence for Uncertain Data
    Wang, Yingxu
    Dong, Jiwen
    Zhou, Jin
    Wang, Lin
    Han, Shiyuan
    Zhang, Tong
    Chen, C. L. Philip
    2017 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2017, : 1972 - 1975