Parallel mining of uncertain data using segmentation of data set area and Voronoi diagrams

Cited by: 0
Authors
Lukic, Ivica [1]
Hocenski, Zeljko [1]
Kohler, Mirko [1]
Galba, Tomislav [1]
Affiliations
[1] Josip Juraj Strossmayer Univ Osijek, Fac Elect Engn Comp Sci & Informat Technol Osijek, Dept Comp Engn & Automat, Osijek, Croatia
Keywords
Clustering algorithms; data mining; data uncertainty; Euclidean distance; parallel algorithms
DOI
10.1080/00051144.2018.1541645
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
The clustering of uncertain objects in large uncertain databases, and the broader problem of mining uncertain data, have been well studied. This paper studies the clustering of uncertain objects with location uncertainty. Moving objects, such as mobile devices, report their locations only periodically, so their locations are uncertain and are best described by a probability density function. The number of objects in a database can be large, which makes mining the data accurately a challenging and time-consuming task. The authors give an overview of existing clustering methods and present a new approach to data mining and parallel computing of clustering problems. All existing methods use pruning to avoid expected distance calculations, because calculating the expected distance requires numerical integration, which is time-consuming. Therefore, a new method, called Segmentation of Data Set Area-Parallel, is proposed. In this method, the data set area is divided into many small segments, and for each segment only the clusters and objects in that segment are considered. The number of segments is calculated from the number and location of clusters. Because the segments are mutually independent, they can be processed in parallel, with the segments distributed across multiple cores.
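The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: each uncertain object is represented by samples drawn from its probability density function, the expected distance is approximated by averaging over those samples instead of true numerical integration, and the segments are a fixed square grid with a hypothetical `cell` size, whereas the paper derives the number of segments from the number and location of clusters. All function names are hypothetical.

```python
import math
import random
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def expected_distance(samples, center):
    """Approximate E[dist(object, center)] by averaging over pdf samples."""
    return sum(math.dist(s, center) for s in samples) / len(samples)


def segment_of(point, cell):
    """Index of the square grid cell containing a point (assumed grid layout)."""
    return (int(point[0] // cell), int(point[1] // cell))


def build_segments(objects, centers, cell):
    """Group object ids and cluster ids by the segment they fall into."""
    segments = defaultdict(lambda: {"objects": [], "centers": []})
    for oid, samples in objects.items():
        mean = (sum(x for x, _ in samples) / len(samples),
                sum(y for _, y in samples) / len(samples))
        segments[segment_of(mean, cell)]["objects"].append(oid)
    for cid, center in centers.items():
        segments[segment_of(center, cell)]["centers"].append(cid)
    return segments


def assign_segment(segment, objects, centers):
    """Assign each object in one segment to its closest cluster by expected distance."""
    # Assumption: if a segment holds no cluster, fall back to all clusters.
    candidates = segment["centers"] or list(centers)
    return {
        oid: min(candidates,
                 key=lambda cid: expected_distance(objects[oid], centers[cid]))
        for oid in segment["objects"]
    }


def assign_parallel(objects, centers, cell, executor):
    """Process the mutually independent segments concurrently and merge results."""
    segments = build_segments(objects, centers, cell)
    futures = [executor.submit(assign_segment, seg, objects, centers)
               for seg in segments.values()]
    labels = {}
    for future in futures:
        labels.update(future.result())
    return labels


# Usage: two uncertain objects with Gaussian location uncertainty, two clusters.
rng = random.Random(0)


def sample_object(cx, cy, n=200, spread=0.5):
    return [(rng.gauss(cx, spread), rng.gauss(cy, spread)) for _ in range(n)]


objects = {"a": sample_object(1.0, 1.0), "b": sample_object(9.0, 9.0)}
centers = {"c0": (2.0, 2.0), "c1": (8.0, 8.0)}

with ThreadPoolExecutor() as pool:
    labels = assign_parallel(objects, centers, cell=5.0, executor=pool)
print(labels)
```

The executor is passed in so that the same code can run with threads, as here, or with a process pool such as `concurrent.futures.ProcessPoolExecutor` when the per-segment work should actually be spread over several cores. Restricting each object to the clusters in its own segment is what reduces the number of expected distance evaluations in this sketch.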
Pages: 349-356
Number of pages: 8