The k-Nearest Neighbor Algorithm Using MapReduce Paradigm

Cited by: 24
Authors
Anchalia, Prajesh P. [1]
Roy, Kaushik [1]
Affiliation
[1] RV Coll Engn, Dept Comp Sci & Engn, Bangalore, Karnataka, India
Source
PROCEEDINGS FIFTH INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS, MODELLING AND SIMULATION | 2014
Keywords
k-Nearest Neighbor; Distributed Computing; Hadoop; MapReduce; Data Mining;
DOI
10.1109/ISMS.2014.94
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Data in any form is a valuable resource, but more often than not data collected in the real world is completely random and unstructured. Hence, to realize the true potential of data as a resource, we must transform it so that meaningful information can be retrieved from it. Data mining fulfills this need. Today there is a need not only for efficient data mining techniques to process large volumes of data, but also for a means of meeting the computational requirements of processing such volumes. In this paper we implement an effective data mining technique, the k-Nearest Neighbor method, in a distributed computing environment running Apache Hadoop, which uses the MapReduce paradigm to process high-volume data.
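As a rough illustration of how k-Nearest Neighbor classification can be split into map and reduce steps, the following is a minimal in-memory sketch in plain Python. It is not the authors' Hadoop implementation and uses no Hadoop API. The function names (map_phase, reduce_phase, knn_classify), the Euclidean distance metric, the majority vote, and the toy data are all assumptions made for illustration.

```python
import heapq
import math
from collections import Counter


def map_phase(partition, query, k):
    """Mapper (hypothetical): return the k nearest (distance, label) pairs in one data split."""
    scored = ((math.dist(features, query), label) for features, label in partition)
    return heapq.nsmallest(k, scored)


def reduce_phase(candidate_lists, k):
    """Reducer (hypothetical): merge per-split candidates, keep the global k nearest, then vote."""
    all_pairs = (pair for candidates in candidate_lists for pair in candidates)
    nearest = heapq.nsmallest(k, all_pairs)
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def knn_classify(partitions, query, k):
    """Run the map step on every split, then a single reduce step."""
    return reduce_phase([map_phase(p, query, k) for p in partitions], k)


# Toy training data, already divided into two splits (assumed for illustration).
partitions = [
    [((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((5.0, 5.0), "b")],
    [((0.9, 1.1), "a"), ((5.2, 4.9), "b"), ((4.8, 5.1), "b")],
]

print(knn_classify(partitions, (1.1, 1.0), k=3))  # prints "a"
```

Because each split forwards only its own k best candidates, the reduce step sees at most k pairs per split, and the global k nearest neighbors are always among them.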
Pages: 513-518
Page count: 6