Case Study of Scientific Data Processing on a Cloud Using Hadoop

Authors
Zhang, Chen [1 ]
De Sterck, Hans [2 ]
Aboulnaga, Ashraf [1 ]
Djambazian, Haig [3 ]
Sladek, Rob
Affiliations
[1] Univ Waterloo, David R Cheriton Sch Comp Sci, Waterloo, ON N2L 3G1, Canada
[2] Univ Waterloo, Dept Appl Math, Waterloo, ON N2L 3G1, Canada
[3] McGill Univ, Genome Quebec Innovat Ctr, Montreal, PQ H3A 1A4, Canada
Source
HIGH PERFORMANCE COMPUTING SYSTEMS AND APPLICATIONS | 2010 / Vol. 5976
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812 ;
Abstract
With the increasing popularity of cloud computing, Hadoop has become a widely used open-source cloud computing framework for large-scale data processing. However, few efforts have been made to demonstrate the applicability of Hadoop to real-world application scenarios outside server-side computations such as web indexing. In this paper, we use the Hadoop cloud computing framework to develop a user application that allows processing of scientific data on clouds. We describe a simple extension to Hadoop's MapReduce that allows it to handle scientific data processing problems with arbitrary input formats and explicit control over how the input is split. We use this approach to develop a Hadoop-based cloud computing application that processes sequences of microscope images of live cells, and we test its performance. We also discuss how the approach can be generalized to more complicated scientific data processing problems.
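The abstract does not spell out how the extension works, so the following is only a language-neutral sketch of the general idea of "explicit control over how the input is split", not the authors' implementation and not the Hadoop API. It assumes a hypothetical file naming scheme `<sequence>_<frame>.tif` and hypothetical helper names (`make_splits`, `sequence_key`, `run_map`): all frames of one image sequence are kept in a single split, so that one map task sees a whole sequence instead of an arbitrary byte range.

```python
from collections import defaultdict
from typing import Callable, List, Tuple


def sequence_key(path: str) -> str:
    """Return the sequence id of an image file.

    Assumption: files are named <sequence>_<frame>.tif (hypothetical scheme).
    """
    name = path.rsplit("/", 1)[-1]
    return name.split("_", 1)[0]


def make_splits(paths: List[str], key_of: Callable[[str], str]) -> List[List[str]]:
    """Group input files into splits, one split per key.

    This plays the role of a user-defined split policy: the framework does
    not cut the input by size, the application decides what belongs together.
    """
    groups = defaultdict(list)
    for path in paths:
        groups[key_of(path)].append(path)
    # Sort keys and files so the result is deterministic.
    return [sorted(groups[key]) for key in sorted(groups)]


def run_map(splits: List[List[str]],
            mapper: Callable[[List[str]], Tuple[str, int]]) -> List[Tuple[str, int]]:
    """Apply the mapper once per split (sequentially here, for illustration)."""
    return [mapper(split) for split in splits]


# Four frames from two hypothetical image sequences, in arbitrary order.
files = [
    "cells/seqB_001.tif",
    "cells/seqA_002.tif",
    "cells/seqA_001.tif",
    "cells/seqB_002.tif",
]

splits = make_splits(files, sequence_key)

# A stand-in mapper: report the sequence id and its number of frames.
results = run_map(splits, lambda split: (sequence_key(split[0]), len(split)))
```

In an actual Hadoop job the same policy would typically be expressed through a custom input format; the grouping logic above only illustrates the decision such a component has to make.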
Pages: 400 / +
Page count: 3