XHAMI - extended HDFS and MapReduce interface for Big Data image processing applications in cloud computing environments

Cited by: 12
Authors
Kune, Raghavendra [1]
Konugurthi, Pramod Kumar [1]
Agarwal, Arun [2]
Chillarige, Raghavendra Rao [2]
Buyya, Rajkumar [3]
Affiliations
[1] Adv Data Proc Res Inst, Dept Space, Hyderabad 500009, Andhra Pradesh, India
[2] Univ Hyderabad, Sch Comp & Informat Sci, Hyderabad, Andhra Pradesh, India
[3] Univ Melbourne, Dept Comp & Informat Syst, Cloud Comp & Distributed Syst CLOUDS Lab, Melbourne, Vic, Australia
Keywords
cloud computing; Big Data; Hadoop; MapReduce; extended MapReduce; XHAMI; image processing; scientific computing; remote sensing
DOI
10.1002/spe.2425
Chinese Library Classification (CLC) number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
The Hadoop distributed file system (HDFS) and the MapReduce model have become popular technologies for large-scale data organization and analysis. The existing model of data organization and processing in Hadoop using HDFS and MapReduce is ideally tailored for search and data-parallel applications, in which a piece of data has no dependency on its neighboring/adjacent data. However, many scientific applications, such as image mining, data mining, knowledge data mining, and satellite image processing, depend on adjacent data for processing and analysis. In this paper, we identify the requirements of overlapped data organization and propose a two-phase extension to HDFS and the MapReduce programming model, called XHAMI, to address them. The extended interfaces are presented as APIs and implemented in the context of the image processing application domain. We demonstrate the effectiveness of XHAMI through case studies of image processing functions, along with the results. Although XHAMI incurs a small overhead in data storage and input/output operations, it greatly enhances system performance and simplifies the application development process. Our proposed system, XHAMI, works without any changes to existing MapReduce models and can be utilized by many applications that require overlapped data. Copyright (C) 2016 John Wiley & Sons, Ltd.
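The need for overlapped data organization can be illustrated with a minimal Python sketch. It is not the XHAMI API, which this record does not describe; the function names `overlapped_blocks`, `vertical_mean3`, and `filter_by_blocks`, the row-wise split, and the 3-row mean filter are all assumptions chosen for illustration. Each block stores a few extra rows copied from its neighbors, so a neighborhood filter can run on every block independently and still reproduce the whole-image result.

```python
def overlapped_blocks(num_rows, block_rows, overlap):
    """Split rows [0, num_rows) into blocks padded with `overlap` halo rows.

    Returns (read_start, read_end, core_start, core_end) tuples: the rows a
    block stores, and the rows it is responsible for producing.
    Hypothetical helper, not part of Hadoop or XHAMI.
    """
    if block_rows <= 0 or overlap < 0:
        raise ValueError("block_rows must be positive and overlap non-negative")
    blocks = []
    for core_start in range(0, num_rows, block_rows):
        core_end = min(core_start + block_rows, num_rows)
        read_start = max(core_start - overlap, 0)
        read_end = min(core_end + overlap, num_rows)
        blocks.append((read_start, read_end, core_start, core_end))
    return blocks


def vertical_mean3(rows, lo, hi):
    """3-row vertical mean for rows lo..hi-1, clamping at the edges of `rows`."""
    last = len(rows) - 1
    out = []
    for r in range(lo, hi):
        above = rows[max(r - 1, 0)]
        below = rows[min(r + 1, last)]
        out.append([(a + b + c) / 3 for a, b, c in zip(above, rows[r], below)])
    return out


def filter_by_blocks(image, block_rows, overlap):
    """Filter each block on its own, as independent map tasks would."""
    result = []
    for read_start, read_end, core_start, core_end in overlapped_blocks(
        len(image), block_rows, overlap
    ):
        chunk = image[read_start:read_end]  # the only data this task can see
        result.extend(
            vertical_mean3(chunk, core_start - read_start, core_end - read_start)
        )
    return result


image = [[r * 10 + c for c in range(4)] for r in range(10)]
whole = vertical_mean3(image, 0, len(image))
with_halo = filter_by_blocks(image, block_rows=4, overlap=1)
without_halo = filter_by_blocks(image, block_rows=4, overlap=0)
```

In this sketch `with_halo` equals `whole`, while `without_halo` differs at the block boundaries because the neighboring row is unavailable there. The halo rows are stored twice, which corresponds to the kind of storage and input/output overhead the abstract mentions.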
Pages: 455-472
Page count: 18