Representative Clustering of Uncertain Data

Cited by: 21
Authors
Zuefle, Andreas [1]
Emrich, Tobias [1]
Schmid, Klaus Arthur [1]
Mamoulis, Nikos [2]
Zimek, Arthur [1]
Renz, Matthias [1]
Affiliations
[1] Ludwig Maximilians Univ Munchen, Inst Informat, Munich, Germany
[2] Univ Hong Kong, Dept Comp Sci, Hong Kong, Peoples R China
Keywords
CONFIDENCE; VARIANCE
DOI
10.1145/2623330.2623725
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper targets the problem of computing meaningful clusterings from uncertain data sets. Existing methods for clustering uncertain data compute a single clustering without any indication of its quality and reliability; thus, decisions based on their results are questionable. In this paper, we describe a framework based on possible-worlds semantics that, when applied to an uncertain data set, computes a set of representative clusterings, each of which has a probabilistic guarantee not to exceed some maximum distance to the ground truth clustering, i.e., the clustering of the actual (but unknown) data. Our framework can be combined with any existing clustering algorithm and it is the first to provide quality guarantees about its result. In addition, our experimental evaluation shows that our representative clusterings have a much smaller deviation from the ground truth clustering than existing approaches, thus reducing the effect of uncertainty.
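The abstract's idea of possible-worlds sampling followed by selection of a representative clustering can be illustrated with a minimal sketch. This is not the authors' algorithm: the discrete 1-D uncertainty model, the gap-based toy clustering, the Rand-style distance, the medoid selection, and all names (`sample_world`, `threshold_clustering`, `rand_distance`, `representative_clustering`, `eps`, `tau`) are assumptions made for illustration. The reported coverage is only an empirical fraction over the sampled worlds, not the probabilistic guarantee described in the paper.

```python
import random
from itertools import combinations


def sample_world(uncertain_objects, rng):
    """Draw one possible world: one alternative per object, chosen by its probability.

    Assumed model: each object is a list of (value, probability) pairs.
    """
    world = []
    for alternatives in uncertain_objects:
        values = [v for v, _ in alternatives]
        weights = [p for _, p in alternatives]
        world.append(rng.choices(values, weights=weights, k=1)[0])
    return world


def threshold_clustering(world, eps):
    """Toy 1-D clustering (stand-in for any algorithm): split at gaps larger than eps."""
    order = sorted(range(len(world)), key=lambda i: world[i])
    labels = [0] * len(world)
    current = 0
    for prev, cur in zip(order, order[1:]):
        if world[cur] - world[prev] > eps:
            current += 1
        labels[cur] = current
    return labels


def rand_distance(a, b):
    """Fraction of object pairs on which two clusterings disagree (1 - Rand index)."""
    pairs = list(combinations(range(len(a)), 2))
    disagree = sum((a[i] == a[j]) != (b[i] == b[j]) for i, j in pairs)
    return disagree / len(pairs)


def representative_clustering(uncertain_objects, n_worlds=200, eps=1.0, tau=0.2, seed=0):
    """Cluster sampled worlds, pick the medoid clustering, and report the
    fraction of sampled clusterings within distance tau of it."""
    rng = random.Random(seed)
    clusterings = [
        threshold_clustering(sample_world(uncertain_objects, rng), eps)
        for _ in range(n_worlds)
    ]
    # Medoid: the sampled clustering with the smallest total distance to all others.
    best = min(clusterings, key=lambda c: sum(rand_distance(c, o) for o in clusterings))
    coverage = sum(rand_distance(best, o) <= tau for o in clusterings) / n_worlds
    return best, coverage


# Hypothetical data: four uncertain objects forming two well-separated groups.
objects = [
    [(0.0, 0.5), (0.2, 0.5)],
    [(0.1, 0.5), (0.3, 0.5)],
    [(10.0, 0.5), (10.2, 0.5)],
    [(10.1, 0.5), (10.3, 0.5)],
]
best, coverage = representative_clustering(objects)
```

Because the clustering step is passed sampled, certain data, any existing clustering algorithm could be substituted for `threshold_clustering`, which mirrors the abstract's claim that the framework is independent of the underlying algorithm.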
Pages: 243-252
Number of pages: 10
Related papers
50 in total
  • [1] RPC: Representative possible world based consistent clustering algorithm for uncertain data
    Liu, Han
    Zhang, Xiaotong
    Zhang, Xianchao
    Li, Qimai
    Wu, Xiao-Ming
    COMPUTER COMMUNICATIONS, 2021, 176 : 128 - 137
  • [2] Representative Query Answers on Uncertain Data
    Schmid, Klaus Arthur
    Zufle, Andreas
    SSTD '19 - PROCEEDINGS OF THE 16TH INTERNATIONAL SYMPOSIUM ON SPATIAL AND TEMPORAL DATABASES, 2019: 140 - 149
  • [3] A Framework for Clustering Uncertain Data
    Schubert, Erich
    Koos, Alexander
    Emrich, Tobias
    Zuefle, Andreas
    Schmid, Klaus Arthur
    Zimek, Arthur
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2015, 8 (12): 1977 - 1980
  • [4] Efficient clustering of uncertain data
    Ngai, Wang Kay
    Kao, Ben
    Chui, Chun Kit
    Cheng, Reynold
    Chau, Michael
    Yip, Kevin Y.
    ICDM 2006: SIXTH INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS, 2006: 436 - 445
  • [5] A Model of Clustering Uncertain Data
    Yang, Zengfang
    Tang, Hewen
    CONFERENCE ON WEB BASED BUSINESS MANAGEMENT, VOLS 1-2, 2010: 969 - 972
  • [6] Bayesian clustering with uncertain data
    Nicholls, Kath
    Kirk, Paul D. W.
    Wallace, Chris
    PLOS COMPUTATIONAL BIOLOGY, 2024, 20 (09)
  • [7] Uncertain Centroid based Partitional Clustering of Uncertain Data
    Gullo, Francesco
    Tagarelli, Andrea
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2012, 5 (07): 610 - 621
  • [8] Enhancement of Data Streaming in Clustering for Uncertain Data
    Ganatra, Jeny
    Thacker, Chintan
    PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS AND SIGNAL PROCESSING, 2018, 671 : 155 - 162
  • [9] Mixture model clustering of uncertain data
    Hamdan, H
    Govaert, G
    FUZZ-IEEE 2005: PROCEEDINGS OF THE IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS: BIGGEST LITTLE CONFERENCE IN THE WORLD, 2005: 879 - 884
  • [10] Efficient clustering of uncertain data streams
    Jin, Cheqing
    Yu, Jeffrey Xu
    Zhou, Aoying
    Cao, Feng
    Knowledge and Information Systems, 2014, 40 : 509 - 539