Parallel mining of uncertain data using segmentation of data set area and Voronoi diagrams

Authors
Lukic, Ivica [1]
Hocenski, Zeljko [1]
Kohler, Mirko [1]
Galba, Tomislav [1]
Affiliations
[1] Josip Juraj Strossmayer Univ Osijek, Fac Elect Engn Comp Sci & Informat Technol Osijek, Dept Comp Engn & Automat, Osijek, Croatia
Keywords
Clustering algorithms; data mining; data uncertainty; Euclidean distance; parallel algorithms;
DOI
10.1080/00051144.2018.1541645
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Clustering of uncertain objects in large uncertain databases and the problem of mining uncertain data have been well studied. In this paper, clustering of uncertain objects with location uncertainty is studied. Moving objects, such as mobile devices, report their locations periodically; their locations are therefore uncertain and are best described by a probability density function. The number of objects in a database can be large, which makes mining accurate data a challenging and time-consuming task. The authors give an overview of existing clustering methods and present a new approach to data mining and parallel computing of clustering problems. All existing methods use pruning to avoid expected distance calculations. Calculating the expected distance requires numerical integration, which is time-consuming. Therefore, a new method, called Segmentation of Data Set Area-Parallel, is proposed. In this method, the data set area is divided into many small segments. Within each segment, only the clusters and objects located in that segment are considered. The number of segments is calculated from the number and location of the clusters. Because the segments are mutually independent, they can be processed in parallel, with the segments distributed across multiple cores.
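The idea in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a 2-D data set, a uniform square grid with a hypothetical cell size `cell` (the paper derives the number of segments from the number and location of the clusters, which is not reproduced here), and that each uncertain object is given as a list of sample points drawn from its probability density function, so the expected distance is approximated by a sample average rather than by exact numerical integration. The function names are hypothetical.

```python
import math
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def expected_distance(samples, center):
    """Approximate the expected Euclidean distance from an uncertain object
    (a list of sample points of its pdf) to a cluster center by averaging."""
    return sum(math.dist(s, center) for s in samples) / len(samples)


def segment_of(point, cell):
    """Grid cell (segment) index of a 2-D point. Assumption: uniform grid."""
    return (int(point[0] // cell), int(point[1] // cell))


def assign_in_segment(objects, centers):
    """Assign each object in one segment to its nearest center by expected
    distance. Only the centers passed in are examined."""
    result = {}
    for obj_id, samples in objects:
        best_id, _ = min(centers, key=lambda c: expected_distance(samples, c[1]))
        result[obj_id] = best_id
    return result


def cluster_by_segments(objects, centers, cell, workers=4):
    """objects: {id: [(x, y), ...]}, centers: {id: (x, y)}.
    Segments are processed independently, one task per segment."""
    objs_by_seg = defaultdict(list)
    for obj_id, samples in objects.items():
        # Assumption: an object belongs to the segment containing its mean.
        mean = (sum(p[0] for p in samples) / len(samples),
                sum(p[1] for p in samples) / len(samples))
        objs_by_seg[segment_of(mean, cell)].append((obj_id, samples))

    centers_by_seg = defaultdict(list)
    for cid, c in centers.items():
        centers_by_seg[segment_of(c, cell)].append((cid, c))

    all_centers = list(centers.items())
    assignment = {}
    # Threads keep the sketch simple; a process pool would be the usual
    # choice for CPU-bound work on multiple cores.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = []
        for seg, objs in objs_by_seg.items():
            # Simplification: a segment without any center falls back to
            # comparing against all centers.
            local = centers_by_seg.get(seg) or all_centers
            futures.append(pool.submit(assign_in_segment, objs, local))
        for fut in futures:
            assignment.update(fut.result())
    return assignment
```

Calling `cluster_by_segments(objects, centers, cell=10.0)` returns a dictionary mapping each object id to a center id. Because each task reads only its own segment's objects and centers, no synchronization between tasks is needed.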
Pages: 349-356
Number of pages: 8