[2] Poznan Supercomp & Networking Ctr PSNC, Poznan, Poland
[3] Lawrence Livermore Natl Lab LLNL, San Francisco, CA USA
[4] Italian Natl Inst Nucl Phys INFN, Bologna, Italy
[5] Univ Politecn Valencia UPV, Valencia, Spain
[6] Univ Catania, Catania, Italy
[7] Lab Instrumentacao & Fis Expt Particulas LIP, Lisbon, Portugal
[8] Oak Ridge Natl Lab ORNL, Oak Ridge, TN USA
[9] Univ Salento, Lecce, Italy
Source:
2016 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA)
|
2016
Keywords:
big analytics;
workflow management;
cloud computing;
ESGF;
INDIGO-DataCloud;
DOI:
10.1109/BigData.2016.7840941
Chinese Library Classification (CLC) number:
TP [Automation Technology, Computer Technology];
Discipline classification code:
0812 ;
Abstract:
A case study on climate model intercomparison data analysis, addressing several classes of multi-model experiments, is being implemented in the context of the EU H2020 INDIGO-DataCloud project. Such experiments require large amounts of data (on the order of multiple terabytes) from the output of several climate model simulations, as well as scientific data management tools for large-scale data analytics. More specifically, the paper discusses in detail a use case on precipitation trend analysis in terms of requirements, architectural design, and infrastructural implementation. The experiment has been tested and validated on CMIP5 datasets in a large-scale distributed testbed across the EU and the US, involving three ESGF sites (LLNL, ORNL, and CMCC) and one central orchestrator site (PSNC).