Data Prefetching for Scientific Workflow Based on Hadoop

Cited by: 0
Authors
Chen, Gaozhao [1]
Wu, Shaochun [1]
Gu, Rongrong [1]
Xu, Yongquan [1]
Xu, Lingyu [1]
Ge, Yunwen [1]
Song, Cuicui [1]
Affiliation
[1] Shanghai Univ, Sch Comp Engn & Sci, Shanghai 200072, Peoples R China
Source
COMPUTER AND INFORMATION SCIENCE 2012 | 2012 / Vol. 429
Keywords
Hadoop; data-intensive; scientific workflow; prefetching
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Data-intensive scientific workflows based on Hadoop require large volumes of data transfer and storage. To address this problem on an execution cluster with limited computing resources, this paper adopts data prefetching to hide the overhead of data search and transfer and to reduce data access delays. A prefetching algorithm for data-intensive scientific workflows that takes the available computing resources into account is proposed. Experimental results indicate that the algorithm reduces response time and improves efficiency.
Pages: 81-92
Page count: 12
Related papers
50 in total
  • [1] Data Prefetching for Heterogeneous Hadoop Cluster
    Vinutha, D. C.
    Raju, G. T.
    2019 5TH INTERNATIONAL CONFERENCE ON ADVANCED COMPUTING & COMMUNICATION SYSTEMS (ICACCS), 2019, : 554 - 558
  • [2] A kind of Prefetching Data Way to Hadoop MapReduce Environments
    Xia, Hui
    Wu, Peng
    PROCEEDINGS OF THE 2016 4TH INTERNATIONAL CONFERENCE ON MACHINERY, MATERIALS AND INFORMATION TECHNOLOGY APPLICATIONS, 2016, 71 : 1278 - 1283
  • [3] Prefetching-based metadata management in Advanced Multitenant Hadoop
    Minh Chau Nguyen
    Won, Heesun
    Son, Siwoon
    Gil, Myeong-Seon
    Moon, Yang-Sae
    JOURNAL OF SUPERCOMPUTING, 2019, 75 (02) : 533 - 553
  • [4] Prefetching-based metadata management in Advanced Multitenant Hadoop
    Minh Chau Nguyen
    Heesun Won
    Siwoon Son
    Myeong-Seon Gil
    Yang-Sae Moon
    The Journal of Supercomputing, 2019, 75 : 533 - 553
  • [5] LIGHTWEIGHT WORKFLOW ENGINE BASED ON HADOOP AND OSGI
    Luo, Shengmei
    Liu, Lixia
    Yang, Juan
    Zhang, Di
    2013 5TH IEEE INTERNATIONAL CONFERENCE ON BROADBAND NETWORK & MULTIMEDIA TECHNOLOGY (IC-BNMT), 2013, : 262 - 267
  • [6] A data dependency based strategy for intermediate data storage in scientific cloud workflow systems
    Yuan, Dong
    Yang, Yun
    Liu, Xiao
    Zhang, Gaofeng
    Chen, Jinjun
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2012, 24 (09) : 956 - 976
  • [7] Data Management for the RedisDG Scientific Workflow Engine
    Abidi, Leila
    Bejaoui, Souha
    Cerin, Christophe
    Lejeune, Jonathan
    Ngoko, Yanik
    Saad, Walid
    2016 IEEE INTERNATIONAL CONFERENCE ON COMPUTER AND INFORMATION TECHNOLOGY (CIT), 2016, : 599 - 606
  • [8] Automated Knowledge Extraction Based on Scientific Workflow for Satellite Remote Sensing Data
    Fang Chaoyang
    Lin Hui
    Zhang Junxian
    GEOINFORMATICS 2008 AND JOINT CONFERENCE ON GIS AND BUILT ENVIRONMENT: ADVANCED SPATIAL DATA MODELS AND ANALYSES, PARTS 1 AND 2, 2009, 7146
  • [9] SciLedger: A Blockchain-based Scientific Workflow Provenance and Data Sharing Platform
    Hoopes, Reagan
    Hardy, Hamilton
    Long, Min
    Dagher, Gaby G.
    2022 IEEE 8TH INTERNATIONAL CONFERENCE ON COLLABORATION AND INTERNET COMPUTING, CIC, 2022, : 125 - 134
  • [10] A Group Based Genetic Algorithm Data Replica Placement Strategy For Scientific Workflow
    Liu, Lihui
    Yang, Ying
    Wang, Haibo
    Tan, Zhifei
    Li, Chen
    2017 16TH IEEE/ACIS INTERNATIONAL CONFERENCE ON COMPUTER AND INFORMATION SCIENCE (ICIS 2017), 2017, : 459 - 464