Parallel mining of uncertain data using segmentation of data set area and Voronoi diagrams

Cited by: 0
Authors
Lukic, Ivica [1 ]
Hocenski, Zeljko [1 ]
Kohler, Mirko [1 ]
Galba, Tomislav [1 ]
Affiliations
[1] Josip Juraj Strossmayer Univ Osijek, Fac Elect Engn Comp Sci & Informat Technol Osijek, Dept Comp Engn & Automat, Osijek, Croatia
Keywords
Clustering algorithms; data mining; data uncertainty; Euclidean distance; parallel algorithms
DOI
10.1080/00051144.2018.1541645
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Clustering of uncertain objects in large uncertain databases and the problem of mining uncertain data have been well studied. This paper studies clustering of uncertain objects with location uncertainty. Moving objects, such as mobile devices, report their locations periodically, so their locations are uncertain and are best described by a probability density function. The number of objects in a database can be large, which makes mining accurate data a challenging and time-consuming task. The authors give an overview of existing clustering methods and present a new approach to data mining and parallel computing of clustering problems. All existing methods use pruning to avoid expected distance calculations, because calculating the expected distance requires numerical integration, which is time-consuming. Therefore, a new method, called Segmentation of Data Set Area-Parallel, is proposed. In this method, the data set area is divided into many small segments, and only the clusters and objects in a given segment are observed. The number of segments is calculated from the number and location of clusters. Because the segments are mutually independent, they can be computed in parallel on multiple cores.
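The segment-wise idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify how segments are laid out, so a uniform square grid with a fixed cell size is assumed here, and the expected distance is approximated by averaging over location samples rather than by the numerical integration the paper refers to. All function names (`segment_of`, `expected_distance`, `assign_in_segment`, `cluster_by_segments`) and the fallback rule for segments without a cluster centre are hypothetical.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
import math


def segment_of(point, cell):
    """Return the (column, row) index of the square grid segment containing point."""
    return (math.floor(point[0] / cell), math.floor(point[1] / cell))


def expected_distance(samples, centre):
    """Approximate the expected Euclidean distance from an uncertain object to a
    centre by averaging over samples of the object's location (assumed to be
    drawn from its probability density function)."""
    return sum(math.dist(s, centre) for s in samples) / len(samples)


def assign_in_segment(task):
    """Assign every object of one segment to the candidate centre with the
    smallest approximate expected distance."""
    seg_objects, candidates = task
    return {
        name: min(candidates, key=lambda cid: expected_distance(samples, candidates[cid]))
        for name, samples in seg_objects
    }


def cluster_by_segments(objects, centres, cell, workers=4):
    """objects: name -> list of (x, y) samples; centres: id -> (x, y)."""
    # Place each object in the segment that contains its mean sampled location.
    objects_by_seg = defaultdict(list)
    for name, samples in objects.items():
        mean = (sum(x for x, _ in samples) / len(samples),
                sum(y for _, y in samples) / len(samples))
        objects_by_seg[segment_of(mean, cell)].append((name, samples))

    # Place each cluster centre in its segment.
    centres_by_seg = defaultdict(dict)
    for cid, centre in centres.items():
        centres_by_seg[segment_of(centre, cell)][cid] = centre

    # Assumption: a segment that holds no centre falls back to all centres.
    tasks = [(objs, centres_by_seg.get(seg) or centres)
             for seg, objs in objects_by_seg.items()]

    # Segments are independent, so they can be processed concurrently.
    assignment = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for partial in pool.map(assign_in_segment, tasks):
            assignment.update(partial)
    return assignment
```

A thread pool keeps the sketch simple to run; for CPU-bound work spread over several cores, `concurrent.futures.ProcessPoolExecutor` offers the same `map` interface and would be the more natural choice.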
Pages: 349-356
Page count: 8