Representative Clustering of Uncertain Data

Cited by: 21
Authors
Zuefle, Andreas [1 ]
Emrich, Tobias [1 ]
Schmid, Klaus Arthur [1 ]
Mamoulis, Nikos [2 ]
Zimek, Arthur [1 ]
Renz, Matthias [1 ]
Affiliations
[1] Ludwig Maximilians Univ Munchen, Inst Informat, Munich, Germany
[2] Univ Hong Kong, Dept Comp Sci, Hong Kong, Peoples R China
Keywords
CONFIDENCE; VARIANCE;
DOI
10.1145/2623330.2623725
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper targets the problem of computing meaningful clusterings from uncertain data sets. Existing methods for clustering uncertain data compute a single clustering without any indication of its quality and reliability; thus, decisions based on their results are questionable. In this paper, we describe a framework based on possible-worlds semantics. When applied to an uncertain dataset, it computes a set of representative clusterings, each of which has a probabilistic guarantee not to exceed some maximum distance to the ground truth clustering, i.e., the clustering of the actual (but unknown) data. Our framework can be combined with any existing clustering algorithm and it is the first to provide quality guarantees about its result. In addition, our experimental evaluation shows that our representative clusterings have a much smaller deviation from the ground truth clustering than existing approaches, thus reducing the effect of uncertainty.
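The possible-worlds idea in the abstract can be illustrated with a small sketch. This is not the paper's algorithm: it assumes each uncertain object is a discrete list of (value, probability) alternatives in one dimension, uses a simple gap-threshold clustering as a stand-in for "any existing clustering algorithm", measures distance between clusterings as 1 minus the Rand index, and picks a single medoid rather than a set of representatives. All function names and the parameters `eps`, `tau`, and `n_worlds` are hypothetical. The returned coverage is only an empirical fraction over sampled worlds, not the probabilistic guarantee the abstract describes.

```python
import random
from itertools import combinations

def sample_world(uncertain_objects, rng):
    """Draw one possible world: pick one alternative per uncertain object."""
    world = []
    for alternatives in uncertain_objects:
        values = [v for v, _ in alternatives]
        weights = [p for _, p in alternatives]
        world.append(rng.choices(values, weights=weights, k=1)[0])
    return world

def threshold_clustering(points, eps):
    """Stand-in clustering: split sorted 1-D points wherever the gap exceeds eps."""
    order = sorted(range(len(points)), key=lambda i: points[i])
    labels = [0] * len(points)
    current = 0
    for prev, nxt in zip(order, order[1:]):
        if points[nxt] - points[prev] > eps:
            current += 1
        labels[nxt] = current
    return labels

def rand_distance(a, b):
    """1 - Rand index: fraction of object pairs on which two clusterings disagree."""
    pairs = list(combinations(range(len(a)), 2))
    disagree = sum((a[i] == a[j]) != (b[i] == b[j]) for i, j in pairs)
    return disagree / len(pairs)

def representative_clustering(uncertain_objects, eps, tau, n_worlds=200, seed=0):
    """Cluster sampled worlds, return the medoid clustering and the fraction
    of sampled clusterings within distance tau of it (an estimate only)."""
    rng = random.Random(seed)
    clusterings = [
        threshold_clustering(sample_world(uncertain_objects, rng), eps)
        for _ in range(n_worlds)
    ]
    # Medoid: the sampled clustering with the smallest total distance to the rest.
    medoid = min(
        clusterings,
        key=lambda c: sum(rand_distance(c, other) for other in clusterings),
    )
    coverage = sum(rand_distance(medoid, o) <= tau for o in clusterings) / n_worlds
    return medoid, coverage

# Hypothetical data: the second object may lie near the first or near the third.
objects = [
    [(0.0, 1.0)],
    [(0.2, 0.7), (4.8, 0.3)],
    [(5.0, 1.0)],
]
medoid, coverage = representative_clustering(objects, eps=1.0, tau=0.0)
print(medoid, coverage)
```

With `tau=0.0` the coverage is the share of sampled worlds whose clustering coincides with the medoid; raising `tau` trades a looser distance bound for higher coverage.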
Pages: 243 - 252
Number of pages: 10
Related papers
50 in total
  • [31] Novel Density-Based Clustering Algorithms for Uncertain Data
    Zhang, Xianchao
    Liu, Han
    Zhang, Xiaotong
    Liu, Xinyue
    PROCEEDINGS OF THE TWENTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2014, : 2191 - 2197
  • [32] EMU: An expectation maximization based approach for clustering uncertain data
    Qin, Biao
    Xia, Yuni
    Li, Fang
    Ge, Jiaqi
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2013, 25 (04) : 1067 - 1083
  • [33] Positioning of Public Service Systems Using Uncertain Data Clustering
    Lukic, Ivica
    Koehler, Mirko
    Slavek, Ninoslav
    ACTA POLYTECHNICA HUNGARICA, 2014, 11 (01) : 121 - 133
  • [34] Clustering data in an uncertain environment using an artificial immune system
    Graaff, A. J.
    Engelbrecht, A. P.
    PATTERN RECOGNITION LETTERS, 2011, 32 (02) : 342 - 351
  • [35] Clustering uncertain data based on probability attribute value similarity
    College of Information Science and Engineering, Yanshan University, 066004, China
    Unknown
    Intl. J. Adv. Comput. Technolog., 22 (417-426):
  • [36] An information-theoretic approach to hierarchical clustering of uncertain data
    Gullo, Francesco
    Ponti, Giovanni
    Tagarelli, Andrea
    Greco, Sergio
    INFORMATION SCIENCES, 2017, 402 : 199 - 215
  • [37] Density-based clustering for evolving uncertain data stream
    He, Haitao
    Zhao, Jintian
    Journal of Computational Information Systems, 2014, 10 (01) : 419 - 426
  • [38] Uncertain Data Clustering Based on Probability Distribution in Obstacle Space
    Wan, Jing
    Cui, Meiyu
    He, Yunbin
    Li, Song
    WIRELESS PERSONAL COMMUNICATIONS, 2020, 111 (04) : 2191 - 2214
  • [39] Multi-dimensional uncertain data stream clustering algorithm
    Luo, Qinghua
    Peng, Yu
    Peng, Xiyuan
    Yi Qi Yi Biao Xue Bao/Chinese Journal of Scientific Instrument, 2013, 34 (06) : 1330 - 1338
  • [40] A Survey of Clustering Uncertain Data Based Probability Distribution Similarity
    Geetha, S.
    Shyla, E. Mary
    INTERNATIONAL JOURNAL OF COMPUTER SCIENCE AND NETWORK SECURITY, 2014, 14 (09) : 77 - 81