Clustering Categorical Data Using Hierarchies (CLUCDUH)

Cited by: 0
Author
Silahtaroglu, Gökhan [1]
Affiliation
[1] Beykent University, Department of Mathematics and Computing, Istanbul 34900, Turkey
Source
World Academy of Science, Engineering and Technology | 2009 / Vol. 56
Keywords
Clustering; Gini; Pruning; Split; Tree
DOI
Not available
Abstract
Clustering large populations is an important problem, particularly when the data contain noise and clusters of different shapes. A good clustering algorithm or approach should be efficient enough to detect clusters sensitively. Besides space complexity, time complexity also gains importance as the data size grows. Using hierarchies, we developed a new algorithm that splits attributes according to their values and chooses the splitting dimension so as to divide the database into parts that are as close to equal as possible. At each node we calculate certain descriptive statistical features of the data residing there, and by pruning we generate the natural clusters with a complexity of O(n).
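The abstract gives no implementation details, so the following is only a minimal sketch of the general idea under stated assumptions, not the paper's algorithm. It assumes that "dividing the database roughly into equal parts" can be approximated by choosing the attribute whose value distribution has the highest Gini impurity, and it replaces the node statistics and pruning step with a simple size threshold. All function names and the `min_size` parameter are hypothetical.

```python
from collections import Counter, defaultdict


def gini(values):
    """Gini impurity of a list of categorical values: 1 - sum(p_i^2)."""
    n = len(values)
    counts = Counter(values)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_split_attribute(rows, attributes):
    """Assumption: the most even partition is the one with maximal Gini."""
    return max(attributes, key=lambda a: gini([r[a] for r in rows]))


def build_tree(rows, attributes, min_size=2):
    """Recursively split rows on one categorical attribute per level.

    A node becomes a leaf when it is small enough, when no attributes
    remain, or when the chosen attribute takes a single value.
    (This stopping rule stands in for the pruning step, which the
    abstract does not specify.)
    """
    if len(rows) <= min_size or not attributes:
        return {"rows": rows}
    attr = best_split_attribute(rows, attributes)
    if gini([r[attr] for r in rows]) == 0.0:
        return {"rows": rows}
    groups = defaultdict(list)
    for r in rows:
        groups[r[attr]].append(r)
    remaining = [a for a in attributes if a != attr]
    children = {v: build_tree(g, remaining, min_size) for v, g in groups.items()}
    return {"attr": attr, "children": children}


def leaves(node):
    """Collect the row groups stored at the leaves; each is one cluster."""
    if "rows" in node:
        return [node["rows"]]
    out = []
    for child in node["children"].values():
        out.extend(leaves(child))
    return out
```

Calling `leaves(build_tree(rows, attributes))` on a list of dictionaries returns one list of rows per cluster. Each level of the tree scans its rows once per remaining attribute, so the cost of this sketch depends on the number of attributes and the tree depth; it is not claimed to reproduce the O(n) bound stated in the abstract.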
Pages: 334-339