A cloud computing system in Windows Azure platform for data analysis of crystalline materials

Cited by: 6
Authors
Xing, Qi [1]
Blaisten-Barojas, Estela [1, 2]
Affiliations
[1] George Mason Univ, Computat Mat Sci Ctr, Fairfax, VA 22030 USA
[2] George Mason Univ, Sch Phys Astron & Computat Sci, Fairfax, VA 22030 USA
Funding
US National Science Foundation;
Keywords
cloud computing; Windows Azure; heterogeneous scientific workflow; machine learning; zeolite structure predictor; ACCESS-CONTROL;
DOI
10.1002/cpe.2912
Chinese Library Classification number
TP31 [Computer Software];
Subject classification code
081202 ; 0835 ;
Abstract
Cloud computing is attracting the attention of the scientific community. In this paper, we develop a new cloud-based computing system in the Windows Azure platform that allows users to use the Zeolite Structure Predictor (ZSP) model through a Web browser. The ZSP is a novel machine learning approach for classifying zeolite crystals according to their framework type. The ZSP can categorize entries from the Inorganic Crystal Structure Database into 41 framework types. The novel automated system permits a user to calculate the vector of descriptors used by ZSP and to apply the model using the Random Forest algorithm for classifying the input zeolite entries. The workflow presented here integrates executables in Fortran and Python for number crunching with packages such as Weka for data analytics and Jmol for Web-based atomistic visualization in an interactive compute system accessed through the Web. The compute system is robust and easy to use. Communities of scientists, engineers, and students knowledgeable in Windows-based computing should find this new workflow attractive and easy to be implemented in scientific scenarios in which the developer needs to combine heterogeneous components. Copyright (c) 2012 John Wiley & Sons, Ltd.
Pages: 2157-2169
Page count: 13