A New Distance Metric for Unsupervised Learning of Categorical Data

Cited by: 0
Authors
Jia, Hong [1]
Cheung, Yiu-ming [1, 2]
Affiliations
[1] Hong Kong Baptist Univ, Dept Comp Sci, Hong Kong, Hong Kong, Peoples R China
[2] BNU HKBU, United Int Coll, Zhuhai, Peoples R China
Source
PROCEEDINGS OF THE 2014 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN) | 2014
Keywords
DISSIMILARITY MEASURE; SIMILARITY; ATTRIBUTE; ASSOCIATION;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A distance metric is the basis of many learning algorithms, and its effectiveness usually has a significant influence on the learning results. Measuring distance for numerical data is generally a tractable task, but for categorical data sets it can be a nontrivial problem. This paper therefore presents a new distance metric for categorical data based on the characteristics of categorical values. Specifically, the distance between two values of one attribute is determined by both the frequency probabilities of these two values and the values of other attributes that are highly interdependent with that attribute. Promising experimental results on different real data sets demonstrate the effectiveness of the proposed distance metric.
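The abstract does not give the metric's formula, so the following is only a rough sketch of the general idea it describes, not the authors' definition. It assumes, purely for illustration, that interdependence between two attributes is measured as mutual information divided by joint entropy, that an attribute counts as "highly interdependent" when this score reaches a hypothetical threshold `beta`, and that a mismatch contributes the summed relative frequencies of the two differing value pairs. The names `interdependence`, `record_distance`, and `beta` are invented for this example.

```python
from collections import Counter
from math import log


def interdependence(col_a, col_b):
    """Assumed score: mutual information divided by joint entropy (0 if entropy is 0)."""
    n = len(col_a)
    count_a, count_b = Counter(col_a), Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = sum(c / n * log(c * n / (count_a[a] * count_b[b]))
             for (a, b), c in count_ab.items())
    h_joint = -sum(c / n * log(c / n) for c in count_ab.values())
    return mi / h_joint if h_joint > 0 else 0.0


def record_distance(data, i, j, beta=0.1):
    """Illustrative distance between rows i and j of a list of equal-length tuples."""
    n = len(data)
    cols = list(zip(*data))
    d = len(cols)
    total = 0.0
    for r in range(d):
        # Attribute r itself (weight 1) plus every other attribute whose
        # interdependence score with r reaches the assumed threshold beta.
        related = [(r, 1.0)]
        for l in range(d):
            if l != r:
                w = interdependence(cols[r], cols[l])
                if w >= beta:
                    related.append((l, w))
        contrib = 0.0
        for l, w in related:
            pair_i = (data[i][r], data[i][l])
            pair_j = (data[j][r], data[j][l])
            if pair_i != pair_j:
                # Frequency-based term: relative frequencies of both value pairs.
                pair_counts = Counter(zip(cols[r], cols[l]))
                contrib += w * (pair_counts[pair_i] + pair_counts[pair_j]) / n
        total += contrib / len(related)
    return total / d


if __name__ == "__main__":
    toy = [("a", "x"), ("a", "x"), ("b", "y"), ("b", "y"), ("a", "y")]
    print(record_distance(toy, 0, 2))
```

In this sketch, identical rows always get distance zero and the result is symmetric in `i` and `j`; whether the published metric shares these exact weights or thresholds cannot be determined from the abstract alone.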
Pages: 1893-1899
Page count: 7