Nearest Neighbor-Based Classification of Uncertain Data

Cited by: 23
Authors
Angiulli, Fabrizio [1]
Fassetti, Fabio [1]
Affiliations
[1] Univ Calabria, DIMES, I-87030 Commenda Di Rende, Italy
Keywords
Algorithms; Classification; uncertain data; nearest neighbor rule; probability density functions; nearest neighbor; SEARCH; TREES
DOI
10.1145/2435209.2435210
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Subject classification code
0812
Abstract
This work deals with the problem of classifying uncertain data. With this aim we introduce the Uncertain Nearest Neighbor (UNN) rule, which represents the generalization of the deterministic nearest neighbor rule to the case in which uncertain objects are available. The UNN rule relies on the concept of nearest neighbor class, rather than on that of nearest neighbor object. The nearest neighbor class of a test object is the class that maximizes the probability of providing its nearest neighbor. The evidence is that the former concept is much more powerful than the latter in the presence of uncertainty, in that it correctly models the right semantics of the nearest neighbor decision rule when applied to the uncertain scenario. An effective and efficient algorithm to perform uncertain nearest neighbor classification of a generic (un)certain test object is designed, based on properties that greatly reduce the temporal cost associated with nearest neighbor class probability computation. Experimental results are presented, showing that the UNN rule is effective and efficient in classifying uncertain data.
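The difference between the nearest neighbor class and the nearest neighbor object can be illustrated with a small sampling experiment. The sketch below is not the algorithm from the paper, which relies on properties that avoid this kind of brute-force computation; it is a naive Monte Carlo estimate written only to show the semantics. The function name `nearest_neighbor_class`, the Gaussian model of each uncertain object, and the toy data are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def nearest_neighbor_class(test_point, training_set, n_trials=2000, seed=0):
    """Estimate by sampling which class most probably supplies the nearest neighbor.

    training_set is a list of (label, mean, std) tuples. Each uncertain object
    is assumed (for illustration only) to be an isotropic Gaussian around mean.
    """
    rng = random.Random(seed)
    wins = Counter()
    for _ in range(n_trials):
        best_label, best_dist = None, math.inf
        for label, mean, std in training_set:
            # Draw one possible realization of this uncertain object.
            sample = [rng.gauss(m, std) for m in mean]
            dist = math.dist(test_point, sample)
            if dist < best_dist:
                best_label, best_dist = label, dist
        # Credit the class that provided the nearest neighbor in this sampled world.
        wins[best_label] += 1
    probs = {label: count / n_trials for label, count in wins.items()}
    return max(probs, key=probs.get), probs


# Hypothetical 1-D toy data: one tightly concentrated object of class A whose
# mean is closest to the test point, and three widely spread objects of class B.
training = [
    ("A", (1.0,), 0.05),
    ("B", (1.3,), 0.6),
    ("B", (-1.3,), 0.6),
    ("B", (1.3,), 0.6),
]
label, probs = nearest_neighbor_class((0.0,), training)
print(label, probs)
```

In this toy setting the object with the closest expected position belongs to class A, yet class B supplies the nearest neighbor in a clear majority of sampled worlds, so a rule based on the nearest neighbor class selects B.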
Pages: 35