Distributed Hierarchal Clustering Algorithm Utilizing a Distance Matrix

Cited by: 0
Authors
Yarmish, Gavriel [1]
Listowsky, Philip [1]
Dexter, Simon [1]
Affiliations
[1] CUNY Brooklyn Coll, Brooklyn, NY 11210 USA
Source
PROCEEDINGS 2017 INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE AND COMPUTATIONAL INTELLIGENCE (CSCI) | 2017
Keywords
Hierarchal Clustering; Distributed Computing; Parallel; K-means; Data Mining;
DOI
10.1109/CSCI.2017.282
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Dividing similar objects into a smaller number of clusters is important in many applications, including search engines, monitoring of academic performance, biology, and wireless networks. We first discuss a number of clustering methods. We then present a parallel algorithm for the efficient clustering of proteins into groups. The input is an n-by-n distance matrix, which is built differently for different applications. For example, for two simple points in space, the matrix entry can be their Euclidean distance. As another example, the Root-Mean-Square-Deviation (RMSD) value can be computed for any two 3-D structures and used as the distance between them. Once the matrix is built, the second step is to use parallel processors to compute a hierarchical clustering of these n items based on this matrix. We have implemented our algorithm and have found it to be scalable.
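The two steps described in the abstract (build a distance matrix, then cluster hierarchically from it) can be illustrated with a minimal serial sketch. This is not the authors' parallel algorithm, which the abstract does not detail. The function names `euclidean_matrix`, `rmsd`, and `single_linkage` are hypothetical, single linkage is an assumed merge rule, and the RMSD helper assumes the two structures are already superimposed and have the same number of atoms.

```python
import math
from itertools import combinations


def euclidean_matrix(points):
    """Build a symmetric n-by-n matrix of pairwise Euclidean distances."""
    n = len(points)
    dist = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d = math.dist(points[i], points[j])
        dist[i][j] = dist[j][i] = d
    return dist


def rmsd(a, b):
    """RMSD between two equal-length lists of 3-D coordinates.

    Assumption: the structures are already aligned; no superposition is done.
    """
    if len(a) != len(b):
        raise ValueError("structures must have the same number of atoms")
    total = sum(math.dist(p, q) ** 2 for p, q in zip(a, b))
    return math.sqrt(total / len(a))


def single_linkage(dist, k):
    """Agglomerative clustering driven only by the distance matrix.

    Repeatedly merges the two closest clusters (single linkage, an assumed
    choice) until k clusters remain. Returns the clusters and the merge
    history, which records the hierarchy.
    """
    clusters = [[i] for i in range(len(dist))]
    merges = []
    while len(clusters) > k:
        best = None
        # Search for the closest pair of clusters. In a distributed setting
        # this search is the natural piece to split across processors.
        for a, b in combinations(range(len(clusters)), 2):
            d = min(dist[i][j] for i in clusters[a] for j in clusters[b])
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        merges.append((sorted(clusters[a]), sorted(clusters[b]), d))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so index a is unaffected
    return [sorted(c) for c in clusters], merges


points = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0)]
clusters, merges = single_linkage(euclidean_matrix(points), 3)
print(clusters)
```

Because the clustering routine only reads the matrix, the same function works whether the entries are Euclidean distances or RMSD values; only the matrix construction changes per application.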
Pages: 1624-1628
Number of pages: 5