StreamSVC: A New Approach To Cluster Large And High-Dimensional Data Streams

Cited by: 0
Authors
Saberi, Hasan [1]
Mehdiaghaei, Mohammadali [2]
Affiliations
[1] Shahid Beheshti Univ Tehran, Dept Computer Sci, Tehran, Iran
[2] Azad Univ Tehran, Dept Comp Engn, Central Branch, Tehran, Iran
Source
WORLD CONGRESS ON ENGINEERING, WCE 2011, VOL III | 2011
Keywords
Data stream; Clustering; SVC; Labeling piece;
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202 ;
Abstract
Data stream mining has been studied extensively in recent years. This paper introduces StreamSVC, a novel method for clustering high-dimensional data streams based on the well-known support vector clustering (SVC) method. SVC maps the data points into a high-dimensional feature space, searches for the minimal enclosing sphere of their images, and then classifies each point according to the distance between its image and the center of that sphere. In StreamSVC, a single change in the data stream causes the algorithm to redo the classification part. The algorithm involves only the parts of the data set that are affected by the change in the stream and updates the classes with acceptable time complexity. To update the clusters during stream processing, we also introduce improvements to the labeling piece of the original SVC. These improvements reduce the computational cost of both the classification part and the cluster labeling piece. The experimental results show both time efficiency and high accuracy on large data streams.
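As a rough illustration of the batch SVC steps the abstract describes (feature-space sphere, distance to its center, labeling), the following is a minimal sketch and not the authors' StreamSVC implementation. It assumes a Gaussian kernel, uses a simple Frank-Wolfe iteration in place of a quadratic-programming solver, omits the soft-margin outlier bound, and performs no incremental stream update. All function names and parameter values (`fit_sphere`, `label_clusters`, `q`, `n_samples`) are hypothetical.

```python
import numpy as np

def gaussian_kernel(A, B, q):
    # K(a, b) = exp(-q * ||a - b||^2) for every row pair of A and B
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-q * d2)

def fit_sphere(X, q, n_iter=500):
    # Approximate the minimal enclosing sphere in feature space:
    # maximize sum_j beta_j K_jj - beta^T K beta over the simplex
    # with Frank-Wolfe steps (an assumption, not the paper's solver).
    n = len(X)
    K = gaussian_kernel(X, X, q)
    beta = np.full(n, 1.0 / n)
    for t in range(1, n_iter + 1):
        grad = np.diag(K) - 2.0 * K @ beta
        i = int(np.argmax(grad))      # point farthest from current center
        step = 2.0 / (t + 2.0)
        beta *= 1.0 - step
        beta[i] += step
    return beta, float(beta @ K @ beta)

def dist2_to_center(Y, X, beta, bKb, q):
    # Squared feature-space distance from each row of Y to the sphere
    # center; K(y, y) = 1 for the Gaussian kernel.
    return 1.0 - 2.0 * gaussian_kernel(Y, X, q) @ beta + bKb

def label_clusters(X, beta, bKb, q, n_samples=10):
    # Two points share a cluster if every sampled point on the segment
    # between them stays inside the sphere; clusters are the connected
    # components of that relation (tracked with union-find).
    n = len(X)
    radius2 = dist2_to_center(X, X, beta, bKb, q).max()
    ts = np.linspace(0.0, 1.0, n_samples + 2)[1:-1, None]
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            seg = X[i] + ts * (X[j] - X[i])
            inside = dist2_to_center(seg, X, beta, bKb, q) <= radius2 + 1e-12
            if np.all(inside):
                parent[find(i)] = find(j)
    roots = [find(i) for i in range(n)]
    _, labels = np.unique(roots, return_inverse=True)
    return labels

# Toy data: two well-separated 2-D blobs (illustrative values only)
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.15, (15, 2)),
               rng.normal(4.0, 0.15, (15, 2))])
beta, bKb = fit_sphere(X, q=0.5)
labels = label_clusters(X, beta, bKb, q=0.5)
```

The pairwise segment test costs O(n^2) kernel evaluations per labeling pass, which suggests why the abstract targets the labeling piece for cost reductions and restricts updates to the points affected by a stream change.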
Pages: 1865-1870
Page count: 6