Research of large scale manifold learning based on MapReduce

Cited by: 0
Authors
Xue, Yong-Jian [1, 2]
Ni, Zhi-Wei [1, 2]
Affiliations
[1] School of Management, Hefei University of Technology, Hefei 230009, China
[2] The MOE Key Laboratory of Process Optimization and Intelligent Decision-Making, Hefei 230009, China
Source
Xitong Gongcheng Lilun yu Shijian/System Engineering Theory and Practice | 2014 / Vol. 34
Keywords
Data reduction - Learning algorithms - Geodesy - Data mining - Hash functions - Learning systems - Artificial intelligence
DOI
Not available
Abstract
With the rapid development of information technology, traditional machine learning and data mining algorithms struggle to handle explosively growing large-scale data. Manifold learning is a dimensionality reduction approach that can overcome some shortcomings of traditional linear dimensionality reduction methods. However, its high computational complexity makes it impractical for large-scale data. To address dimensionality reduction for large-scale data, a distributed manifold learning algorithm based on MapReduce is proposed. Locality-sensitive hash functions are used to map similar points to the same bucket. The geodesic distance between points in the same bucket is then computed with the Euclidean norm, since a manifold is locally homeomorphic to Euclidean space, and the geodesic distance between points in different buckets is computed with a modified geodesic distance formula that makes use of central points and edge points. Experiments on large-scale synthetic and real datasets show that this distributed manifold learning algorithm approximates the geodesic distance between points effectively and is useful for large-scale dimensionality reduction.
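The bucketing idea described in the abstract can be illustrated with a minimal single-process Python sketch. It is not the authors' implementation. The abstract specifies neither the hash family nor the modified inter-bucket formula, so the sketch assumes a random-projection (p-stable) hash and approximates the distance between points in different buckets by routing through the two bucket centroids, ignoring edge points. The names `lsh_buckets` and `approx_geodesic` and the parameters `n_hashes`, `width`, and `seed` are hypothetical.

```python
from collections import defaultdict

import numpy as np


def lsh_buckets(points, n_hashes=2, width=1.0, seed=0):
    """Group point indices by a random-projection LSH key (assumed hash family)."""
    rng = np.random.default_rng(seed)
    directions = rng.normal(size=(points.shape[1], n_hashes))
    offsets = rng.uniform(0.0, width, size=n_hashes)
    # h(x) = floor((a . x + b) / w), one value per hash function
    keys = np.floor((points @ directions + offsets) / width).astype(int)
    buckets = defaultdict(list)
    for idx, key in enumerate(keys):
        # In a MapReduce setting, the map phase would emit (key, point)
        # and the reduce phase would receive each bucket.
        buckets[tuple(int(v) for v in key)].append(idx)
    return dict(buckets)


def approx_geodesic(points, buckets, i, j):
    """Approximate the geodesic distance between points i and j."""
    bucket_of = {idx: key for key, members in buckets.items() for idx in members}
    key_i, key_j = bucket_of[i], bucket_of[j]
    if key_i == key_j:
        # Same bucket: treat the neighbourhood as locally Euclidean.
        return float(np.linalg.norm(points[i] - points[j]))
    # Different buckets: ASSUMED stand-in for the paper's modified formula,
    # a path through the two bucket centroids (edge points are not used here).
    center_i = points[buckets[key_i]].mean(axis=0)
    center_j = points[buckets[key_j]].mean(axis=0)
    return float(
        np.linalg.norm(points[i] - center_i)
        + np.linalg.norm(center_i - center_j)
        + np.linalg.norm(center_j - points[j])
    )
```

Calling `lsh_buckets(points)` on an `(n, d)` array and then `approx_geodesic(points, buckets, i, j)` returns a plain Euclidean distance inside a bucket and a centroid-routed path length across buckets. By the triangle inequality the latter is never shorter than the straight-line distance.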
Pages: 151-157