Case Study of Scientific Data Processing on a Cloud Using Hadoop

Authors
Zhang, Chen [1]
De Sterck, Hans [2]
Aboulnaga, Ashraf [1]
Djambazian, Haig [3]
Sladek, Rob
Affiliations
[1] Univ Waterloo, David R Cheriton Sch Comp Sci, Waterloo, ON N2L 3G1, Canada
[2] Univ Waterloo, Dept Appl Math, Waterloo, ON N2L 3G1, Canada
[3] McGill Univ, Genome Quebec Innovat Ctr, Montreal, PQ H3A 1A4, Canada
Source
HIGH PERFORMANCE COMPUTING SYSTEMS AND APPLICATIONS | 2010 / Vol. 5976
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
With the increasing popularity of cloud computing, Hadoop has become a widely used open-source cloud computing framework for large-scale data processing. However, few efforts have been made to demonstrate the applicability of Hadoop to real-world application scenarios outside server-side computations such as web indexing. In this paper, we use the Hadoop cloud computing framework to develop a user application that allows processing of scientific data on clouds. We describe a simple extension to Hadoop's MapReduce that allows it to handle scientific data processing problems with arbitrary input formats and explicit control over how the input is split. We use this approach to develop a Hadoop-based cloud computing application that processes sequences of microscope images of live cells, and we test its performance. We also discuss how the approach can be generalized to more complicated scientific data processing problems.
Pages: 400 / +
Page count: 3